ABSTRACT



Encoded data embedded in an iconic, or reduced size,
version of an original text image is decoded and used in a 
variety of document image management applications to 
provide input to, or to control the functionality of, an 
application. The iconic image may be printed in a suitable 
place (e.g., the margin or other background region) in the 
original text image so that a text image so annotated will 
then always carry the embedded data in subsequent copies
made from the annotated original. The iconic image may 
also be used as part of a graphical user interface as a 
surrogate for the original text image. An encoding operation 
encodes the data unobtrusively in the form of rectangular 
blocks that have a foreground color and size dimensions 
proportional to the iconic image so that when placed in the 
iconic image in horizontal lines, the blocks appear to a 
viewer to be representative of the text portion of the original 
image that they replace. Several embodiments are 
illustrated, including using the iconic image as a document 
surrogate for the original text image for data base retrieval 
operations. The iconic image may also be used in conjunc- 
tion with the original text image for purposes of authenti- 
cating the original document using a digital signature 
encoded in the iconic image, or for purposes of controlling 
the authorized distribution of the document. The iconic 
image may also carry data about the original image that may 
be used to enhance the performance and accuracy of a 
subsequent character recognition operation. 

12 Claims, 20 Drawing Sheets 
PERFORMING DOCUMENT IMAGE
MANAGEMENT TASKS USING AN ICONIC 
IMAGE HAVING EMBEDDED ENCODED 
INFORMATION 

CROSS REFERENCE TO RELATED
APPLICATIONS 

The present invention is related to an invention that is the
subject matter of a previously filed, copending, commonly
assigned U.S. patent application having the same inventor as
the subject application and having the following serial
number and title: Application Ser. No. 08/671,423, "Embed-
ding Encoded Information in an Iconic Version of a Text
Image". Application Ser. No. 08/671,423 is hereby incorpo-
rated by reference herein as if set out in full, and will be
subsequently referred to herein as "the Iconic Image Encod-
ing application."

BACKGROUND OF THE INVENTION 

The present invention relates generally to a processor-
based technique in the field of document image
management, and more particularly, to document image
management applications that make use of an iconic image
version of a document (text) image that includes encoded
binary information embedded in the iconic image to perform
document image management tasks related to the text
image.

A common practice in computer-implemented graphical
user interfaces is to use small graphical images called
"icons" to represent software applications and functions.
The advantages of using icons have been applied to the
domain of images, and reduced-size versions of images,
often called "thumbnail" images, have been used in several
contexts. In a reduced version of an image, the characteristic
page layout appearance of the full size page is preserved and
objects are proportionally reduced and placed in positions in
the thumbnail image that are substantially equivalent to
positions in the full size version of the image. The preser-
vation of the page layout features of the full size version of
the image, such as the margin dimensions, the placement of
headers and footers, the spacing between paragraphs and of
lines within paragraphs, the presence or absence of text
justification, and the proportional reduction of text in vari-
ous font sizes, all contribute to producing a thumbnail image
which, because of human pattern matching abilities, is easily
recognizable to a viewer as representative of the full size
image. A reduced-size version of an original image that
substantially preserves visually significant page layout fea-
tures of the full size version of the image will be referred to
herein as an iconic version of the original image, or simply
as an iconic image.

Iconic images have been used in computer-implemented
applications to augment and exploit human memory and
pattern matching skills. Story et al., in "The RightPages
Image-Based Electronic Library for Alerting and
Browsing," in IEEE Computer, September 1992, pp. 17-26,
disclose a prototype electronic library that provides certain
library services to its users. A user interface shows an image
area including "stacks" containing reduced-size images of
journal covers that users can view in a way analogous to
viewing journal covers on library shelves. To examine the
contents of a particular journal, the user selects a journal
with a mouse, and the system displays an image of the table
of contents. In addition to saving display space, the use of
thumbnail image versions of the journals' covers exploits
the user's familiarity with the appearance of the covers of
publications in a particular field of science.

Mark Peairs, in "Iconic Paper," in Proceedings of the
International Conference on Document Analysis and 
Recognition, Montreal, Canada, 1995, pp. 1174-1179, dis- 
closes a technique that uses thumbnail images, referred to as 
icons, to retrieve documents from an electronic database. 
The technique provides a physical sheet of paper as a 
representation that can be used by humans for recognition 
and by machines for indexing. A document can be accessed 
by a gesture indicating a particular icon on the page. The 
technique exploits the pattern matching abilities of the 
human user using page characteristics of the original image 
that are still identifiable at the selected reduction scale. To 
employ the pattern recognition method, a text retrieval 
operation uses character counts of each word of text in 
document images to index a table of document identifiers 
that can then be used to locate the original page or document 
in a data base. Character positions in an original page of text 
are determined and a special pixel pattern is positioned on a 
one-for-one basis in the icon in place of each character in 
order to preserve the ability to compute character counts of 
words in the reduced version of the image. During a retrieval 
operation, an iconic image is selected by a user, the special 
pixel patterns are located in the iconic image, counts of the 
lengths of words are made, and the counts converted to an 
index in the table for retrieving the original image. 
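
The word-length indexing idea described above can be sketched as follows. This is only an illustrative sketch, not the cited author's implementation: the function names, the choice of the first eight words as the key, and the sample document identifiers are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def word_length_key(text: str, max_words: int = 8) -> Tuple[int, ...]:
    """Return the character counts of the first max_words words of a page."""
    return tuple(len(word) for word in text.split()[:max_words])


def build_index(pages: Dict[str, str]) -> Dict[Tuple[int, ...], List[str]]:
    """Map each word-length key to the (hypothetical) document identifiers."""
    index: Dict[Tuple[int, ...], List[str]] = {}
    for doc_id, text in pages.items():
        index.setdefault(word_length_key(text), []).append(doc_id)
    return index


# Hypothetical pages standing in for full-size text images.
pages = {
    "doc-001": "Iconic images exploit human pattern matching skills",
    "doc-002": "Bar codes explicitly insert encoded information in an image",
}
index = build_index(pages)

# Counts as they might be measured from the special pixel patterns in an icon.
icon_counts = (6, 6, 7, 5, 7, 8, 6)
matches = index.get(icon_counts, [])
```

In this sketch the icon never needs legible characters; only the number of per-character marks in each word has to survive the reduction.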

Image marking and encoding techniques are used in a 
wide variety of applications to insert, or embed, encoded 
information into an image; the embedded information is then 
subsequently decoded and used for a variety of purposes, 
some of which include carrying out tasks that can be 
generally classified as document image management tasks. 
Bar codes explicitly insert encoded information in an image, 
and are typically used in applications where the obvious and 
perceptible presence of the encoded information is not a 
disadvantage. For example, U.S. Pat. No. 5,506,697 dis- 
closes applications for encoding the machine-readable or 
human-readable content of an original document in one or 
more two-dimensional bar code symbols; the symbol with 
the original document, or just the symbol, may then be 
facsimile transmitted to a remote location where the original 
document may be regenerated using data decoded from the 
facsimile-transmitted bar code symbol. 

The field of innocuous, or surreptitious, image marking is
known as steganography, or "covered writing." Data glyph
technology is a category of embedded encoded information
that is particularly advantageous for use in applications that
require the embedded data to be robust for decoding pur-
poses yet inconspicuous, or even surreptitious, in the result-
ing image. Data glyph technology encodes digital informa-
tion in the form of binary 1's and 0's which are then
rendered in the form of very small linear marks. Generally,
each small mark represents a digit of binary data; whether
the particular digit is a digital 1 or 0 depends on the linear
orientation of the particular mark. For example, in one
embodiment, marks which are oriented from top left to
bottom right may represent a 0, while marks oriented from
bottom left to top right may represent a 1. The individual
marks are of such a size relative to the maximum resolution
of a printing device as to produce an overall visual effect to
a casual observer of a uniform gray halftone area when a
large number of such marks are printed together on paper,
and the halftone area in the document, when incorporated in
an image border or graphic, does not explicitly suggest that
embedded data is present. A viewer of the image could
perhaps detect only by very close scrutiny that the small dots
forming the gray halftone area are a series of small marks
which together bear binary information. U.S. Pat. Nos.
5,091,966, 5,128,525, 5,168,147, 5,221,833, 5,245,165,
5,315,098, and 5,449,895, and U.S. patent application Ser.
No. 07/560,514, all assigned to the assignee of the present
invention, provide additional information about the uses,
encoding and decoding techniques of data glyphs. For
example, U.S. Pat. No. 5,315,098, entitled "Methods and
Means for Embedding Machine Readable Digital Data in
Halftone Images," discloses techniques for encoding digital
data in the angular orientation of circularly asymmetric
halftone dot patterns that are written into the halftone cells
of digital halftone images, and U.S. Pat. No. 5,168,147 by
the named inventor herein and entitled "Binary Image
Processing for Decoding Self-Clocking Glyph Shape
Codes" discloses image processing techniques, including
image morphology techniques, for decoding glyph codes
embedded in scanned images.
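
The orientation convention given above for one embodiment (top left to bottom right for a 0, bottom left to top right for a 1) can be illustrated with a short sketch. It uses the text characters "\" and "/" as stand-ins for printed marks; the function names are hypothetical, and nothing here reproduces the image processing taught in the cited patents.

```python
def encode_glyphs(data: bytes) -> str:
    """Render each bit as a mark: a backslash for 0, a forward slash for 1."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join("/" if bit == "1" else "\\" for bit in bits)


def decode_glyphs(marks: str) -> bytes:
    """Recover bytes from a string of marks produced by encode_glyphs."""
    if len(marks) % 8 != 0 or set(marks) - {"/", "\\"}:
        raise ValueError("marks must be '/' or '\\' in groups of eight")
    bits = "".join("1" if mark == "/" else "0" for mark in marks)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


glyphs = encode_glyphs(b"A")   # 'A' is 0x41, binary 01000001
restored = decode_glyphs(glyphs)
```

Printed densely, a long run of such marks has roughly equal ink coverage regardless of the data, which is consistent with the uniform gray appearance described above.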

U.S. Pat. No. 5,486,686, assigned to the assignee of the
present invention and entitled "Hardcopy Lossless Data
Storage and Communications for Electronic Document Pro-
cessing Systems," discloses an improvement to an electronic
document processing system for transferring information
back and forth between an electronic domain and a hardcopy
domain. An interface means is provided between a computer
that operates on and stores electronic document files and a
printing device, where the printing device prints on a hard-
copy document both the human readable renderings of an
electronic document and machine readable attributes of the
electronic document. The machine readable attributes are
recoverable from the code printed on the hardcopy docu-
ment when information carried by the document is trans-
formed from the hardcopy domain to the electronic domain,
such as for example by scanning the physical document.
Data glyphs are disclosed as a way of encoding the machine
readable attributes of the electronic document on the hard-
copy document. It is disclosed by way of example that all or
only selected portions of the ASCII content of the electronic
document, the document description language definition of
the electronic document, or the printer description language
definition of the document may be printed on the hardcopy
document. When a sufficient amount of information is
encoded, the physical document serves as a lossless data
storage mechanism for the electronic document.

U.S. Pat. No. 5,444,779, assigned to the assignee of the
present invention and entitled "Electronic Copyright Roy-
alty System Using Glyphs," discloses a system for utilizing
a printable yet unobtrusive data glyph or similar two-
dimensionally encoded mark to identify copyrighted docu-
ments for the purpose of collecting copyright royalties or
preventing document reproduction upon decoding of the
data glyphs or other encoded marks. In connection with a
processor-controlled copier, printing system or advanced
reprographic apparatus, the decoded information may be
used, for example, to indicate that a copyright fee is payable,
or to prevent printing or reproduction of the document. In
one embodiment, the data glyph coding indicating the copy-
right royalty and reproduction information may be updated
and newly printed, or new information may be added, on the
copy or copies produced by the printing system or repro-
graphic apparatus.

U.S. Pat. No. 5,060,980, assigned to the assignee of the
present invention and entitled "Form Utilizing Encoded
Indications for Form Field Processing," discloses a novel
type of form document that may include one or more of a
variety of types of fields, as well as other non-field infor-
mation and that carries an encoded description of itself. A
form generation process automatically encodes information
about the fields as the form is being created, and integrates
the encoded information into the electronic and printed
representations of the form. The particular encoding scheme
used may be any suitable encoding scheme that is able to
encode the information compactly enough to allow room on
the form for the fields themselves. A form interpreter accepts
information from a scanner, locates the encoded information
representing the form description in the scanner information,
interprets the encoded information, and performs either
preprogrammed operations on the information located in
specified fields or of a specified data type, or performs
operations on that information from instructions encoded
with the form description itself.

A related practice in the field of steganography is that of
image marking, sometimes referred to as "digital
watermarking," analogous to the practice of marking paper
with a largely indiscernible design during manufacture. In
document marking applications, one or more codewords are
embedded in a document image in a manner that is substan-
tially indiscernible to a reader but can be reliably recovered
and decoded. For example, using the least significant bit of
each pixel in an eight bit per pixel grayscale image to encode
a message would cause little or no impact on the appearance
of the image, yet a 480 pixel wide by 100 pixel high image
could theoretically contain a message of more than 5,000
characters. The same principles apply to audio and video
files as well. Moreover, the image can be used simply as a
carrier, with the message first being encrypted. Concealment
of the encoded data is typically an important goal in those
applications in which the document is being marked so that
it may be traced or authenticated.
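
The capacity figure in the least significant bit example above can be checked with a short sketch. It assumes one bit per pixel and eight bits per character, which the text does not state explicitly; the function names and the flat list of pixel values are likewise assumptions made for illustration.

```python
from typing import List


def capacity_chars(width: int, height: int, bits_per_char: int = 8) -> int:
    """Characters that fit when one bit is stored per pixel."""
    return (width * height) // bits_per_char


def embed_lsb(pixels: List[int], message: bytes) -> List[int]:
    """Overwrite the least significant bit of successive pixels with message bits."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("message does not fit in the image")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & 0xFE) | bit
    return marked


def extract_lsb(pixels: List[int], n_bytes: int) -> bytes:
    """Read n_bytes back from the least significant bits of the pixels."""
    out = bytearray()
    for k in range(n_bytes):
        byte = 0
        for i in range(8):
            byte = (byte << 1) | (pixels[8 * k + i] & 1)
        out.append(byte)
    return bytes(out)


gray = [128] * (480 * 100)           # uniform 8-bit grayscale image, row-major
marked = embed_lsb(gray, b"trace me")
recovered = extract_lsb(marked, len(b"trace me"))
```

Under these assumptions the 480 by 100 image holds 48,000 bits, or 6,000 eight-bit characters, and no pixel value changes by more than one gray level.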

An example of image marking is disclosed in U.S. Pat.
No. 5,278,400, assigned to the assignee of the present 
invention and entitled "Multiple Threshold Encoding of 
Machine Readable Code." U.S. Pat. No. 5,278,400 discloses 
a method and apparatus for applying coded data to a 
substrate and decoding the data where the data are encoded 
in uniformly sized groups of pixels, called cells. Each cell is 
encoded by distinctively marking a certain number of the 
pixels to represent the code, without regard to the position 
in the cell of a marked pixel. For example, a cell comprised 
of six pixels each of which may be marked in black or white 
provides for seven possible black-white combinations of the 
pixels in the cell; a series of three cells provides for 7^3 (343)
possible coded combinations, more than enough to encode 
the 256 character ASCII character set with only 18 pixels. 
The characteristics of the marking of each cell are preferably 
the same to facilitate robustness for decoding purposes. 
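
The counting argument in the paragraph above can be sketched as follows. This is a simplified illustration of encoding by the number of marked pixels in a cell, not the multiple threshold method of the cited patent; the base-7 digit layout, the function names, and the fixed random seed are assumptions.

```python
import random
from typing import List, Optional

CELL_SIZE = 6                # pixels per cell
LEVELS = CELL_SIZE + 1       # 0 to 6 marked pixels gives seven values per cell
CELLS_PER_BYTE = 3           # 7**3 = 343 combinations, enough for 256 values


def encode_byte(value: int, rng: Optional[random.Random] = None) -> List[List[int]]:
    """Encode one byte as three cells; each cell holds one base-7 digit."""
    if not 0 <= value < 256:
        raise ValueError("value must fit in one byte")
    rng = rng or random.Random(0)
    cells = []
    for _ in range(CELLS_PER_BYTE):
        value, digit = divmod(value, LEVELS)
        cell = [1] * digit + [0] * (CELL_SIZE - digit)
        rng.shuffle(cell)    # the position of a marked pixel carries no meaning
        cells.append(cell)
    return cells


def decode_byte(cells: List[List[int]]) -> int:
    """Recover the byte by counting marked pixels in each cell."""
    value = 0
    for cell in reversed(cells):
        value = value * LEVELS + sum(cell)
    return value


cells = encode_byte(255)
```

Because only the count matters, a decoder of this kind tolerates any rearrangement of marked pixels within a cell.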

Digimarc Corporation of Portland, Oregon markets soft-
ware for embedding information such as electronic signa-
tures or serial numbers directly within photographs, video,
audio, and other creative property. According to company-
published information available as of the filing date of this
application at www.digimarc.com entitled "Digimarc Corp
Announces New Copyright Protection Technology", Jun.
27, 1995 (hereafter, "the Digimarc press release"), a Digi-
marc™ signature may be used for a wide variety of
applications, including verifying copyright ownership,
detecting alterations, triggering digital-cash meters, or track-
ing black-market distribution. The Digimarc press release
discloses that the technology combines the information to be
embedded with a random code pattern, and this combined
signature is then added to the digitized image (or other
creative property) as a very low signal level to produce an
invisibly signed image, undetectable to the human eye and
ear. The Digimarc press release discloses that, since the
signature is melded with the image itself, it survives con-
versions from digital to analog and back. An encoded
signature can be found by a computer analysis using the
creator's unique code pattern. The Digimarc press release
further discloses that, without this code, the signatures are
impossible to detect or remove. The signatures are
holographic, meaning that the entire signature is contained
in a small section of an image, or in a tiny clip of music. The
signatures are robust, so they can survive multiple genera-
tions of copying, transforming, printing, scanning, or com-
pression. The Digimarc press release further discloses that
Digimarc signatures can identify ownership and other infor-
mation about an image or other intellectual property without
completely locking out access (such as encryption does),
being separated from the image (as file headers can be) or
damaging the image (as watermarking, thumbnails, or
reduced resolution versions do).

Document marking can also be achieved by altering the
text formatting in a document, or by altering certain char-
acteristics of textual elements (e.g., characters), in a manner
that is both reliably able to be decoded even in the presence
of noise and that is largely indiscernible to a reader. Brassil
et al., in "Electronic Marking and Identification Techniques
to Discourage Document Copying" in IEEE Journal on
Selected Areas in Communications, Vol. 12, No. 8, October
1995, pp. 1495-1504, disclose three techniques for embed-
ding a unique codeword in a document that enables identi-
fication of the sanctioned recipient of the document while
being largely indiscernible to document readers, for the
purpose of discouraging unauthorized document distribu-
tion. The image coding schemes were designed to be attack-
resistant by ensuring that substantial effort would be
required to remove the document encoding, and that suc-
cessful removal of the encoding would result in a substantial
loss of document presentation quality. The techniques dis-
closed include line shift coding, word shift coding and
feature coding. Use of these techniques in the resulting
image is typically not noticeable to a viewer of the image,
and text in the image is not substantively altered. With
respect to line shift coding, Brassil et al. disclose that each
intended document recipient is preassigned a unique code-
word that specifies a set of text lines to be moved in the
document specifically for that recipient. The codeword is
decoded by performing image analysis on a copy of the
document to detect the moved lines and reconstruct the
codeword to identify the authorized recipient.
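
Line shift coding as summarized above can be sketched in a few lines. This is a simplified illustration rather than the scheme of Brassil et al.: it assumes one codeword bit per text line, a downward shift of a fixed number of pixels for a 1 bit, and a decoder that knows the unshifted baseline positions. All names and values are hypothetical.

```python
from typing import List, Sequence


def apply_line_shifts(baselines: Sequence[float], codeword: Sequence[int],
                      shift: float = 2.0) -> List[float]:
    """Move each line whose codeword bit is 1 down by shift pixels."""
    if len(baselines) != len(codeword):
        raise ValueError("one codeword bit is needed per text line")
    return [y + shift * bit for y, bit in zip(baselines, codeword)]


def recover_codeword(observed: Sequence[float], reference: Sequence[float],
                     shift: float = 2.0) -> List[int]:
    """Classify a line as moved when it lies more than half a shift from its reference."""
    return [1 if abs(o - r) > shift / 2 else 0 for o, r in zip(observed, reference)]


reference = [100.0, 120.0, 140.0, 160.0]   # unmarked baseline positions (pixels)
codeword = [0, 1, 0, 1]                    # hypothetical recipient codeword
marked = apply_line_shifts(reference, codeword)

# Baselines as they might be measured on a noisy photocopy of the marked page.
noisy = [100.3, 121.8, 139.7, 162.4]
decoded = recover_codeword(noisy, reference)
```

The sketch also makes the capacity limit concrete: under these assumptions a page carries at most one bit per text line.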

Many document image management operations of the
kind just described make use of encoded information to
perform their tasks. Some of these techniques may be
deficient for certain types of applications. The techniques
that rely on encoding the information such that it is incor-
porated somewhere in the full-size version of the document
image may substantially alter the presentation of the docu-
ment or cause the document to be aesthetically unappealing.
Explicit bar code encoding of information draws attention to
the fact that encoded information is carried on the document
and may also be aesthetically inappropriate. The technique
of line shift coding disclosed by Brassil claims to be
indiscernible but it would appear that the amount of infor-
mation that may be encoded is minimal and may be insuf-
ficient for other types of tasks.

SUMMARY OF THE INVENTION

The present invention is premised on the observation that,
for many applications, techniques for embedding informa-
tion unobtrusively in an image can be combined with the use
of an iconic image representation in order to take advantage of
the benefits of encoding useful information in an indiscern-
ible manner while retaining the ability to exploit human
pattern matching capabilities for certain types of document
image management applications where such capabilities
provide leveraged functionality. In addition, the iconic
image serves as a useful mechanism for inconspicuously
embedding digital information in images in any document
image management task where the presence of an iconic
image is provided as a surrogate for a full-sized version of
an image, regardless of whether the iconic image is specifi-
cally used for the purpose of providing clues for recognizing
the full-sized image.

The iconic image includes embedded encoded data that is
positioned where the reduced version of text in the original text
image would appear, and that is rendered as a series of rect-
angular blocks. At the reduced size, these rectangular blocks
appear as straight lines and have the appearance of 
"greeked" text, a technique that is used to replace the 
rendering of actual text when rendering actual text reduces 
performance or efficiency of an operation. Thus, a viewer of 
the iconic image who is unable to see a reduced version of 
the text is not likely to interpret the "greeked" text as a signal 
of the presence of embedded data, but is more likely to 
interpret it as a normal consequence of the image reduction 
operation. 

The encoding operation may implement any suitable 
encoding scheme that produces rectangular blocks that have 
a foreground color and that have size dimensions propor- 
tional to the iconic image so that when placed in the iconic 
image, the rectangular blocks appear to a viewer to be 
representative of a text portion of the original image. A 
significant advantage of the encoding operation that creates 
the iconic image is that the message carried by the binary 
data and the resulting rectangular blocks may be any infor- 
mation suitable for a particular document image manage- 
ment task, and need not be restricted to a reproduction of, or 
information about, the text in the original image that the 
encoded data replaces. The encoded binary data may be 
referred to as "arbitrary" binary data, in the sense that the 
message of the encoded data need bear no relationship to any 
text included in the full size image nor to any information 
about the full size image. When the iconic image is used in 
a document image management task, the encoded binary 
data provides input information required to perform the task. 

The encoding operation is designed to be robust for 
decoding purposes to permit reliable and accurate recovery 
of the encoded information regardless of the resolution at 
which a copy of the iconic image is subsequently rendered. 
The use of rectangular blocks that approximate the size of 
words in text to contain the encoded data provides a sig- 
nificant advantage in robustness and reliability of decoding: 
rectangular blocks are relatively straightforward to detect 
reliably in image segmentation operations, and are likely to 
suffer less from the problem of touching components than 
would the use of character-by-character encoding. Rectan- 
gular blocks are also robust for applications in which the 
iconic image is printed or scanned using low resolution 
devices that may introduce noise or distortion into the image 
data representing the iconic image. 
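
The claim that rectangular blocks are straightforward to segment can be illustrated with a run-length sketch over one row of a binary image. The decoding rule used here, in which a block at least as wide as a threshold reads as 1 and a narrower block reads as 0, is purely a hypothetical stand-in; this section does not specify which characteristic property of a block carries the data. The function names and pixel values are also assumptions.

```python
from typing import List, Sequence, Tuple


def find_blocks(row: Sequence[int], min_width: int = 2) -> List[Tuple[int, int]]:
    """Return (start, width) for each run of foreground pixels at least min_width wide."""
    blocks = []
    start = None
    for x, pixel in enumerate(list(row) + [0]):   # trailing 0 closes a final run
        if pixel and start is None:
            start = x
        elif not pixel and start is not None:
            if x - start >= min_width:            # narrower runs are treated as noise
                blocks.append((start, x - start))
            start = None
    return blocks


def decode_blocks(blocks: Sequence[Tuple[int, int]], threshold: int) -> List[int]:
    """Hypothetical rule: a block at least threshold pixels wide reads as 1, else 0."""
    return [1 if width >= threshold else 0 for _, width in blocks]


# One scanline: blocks of width 4, 7 and 6, plus a single-pixel noise speck.
row = [0] + [1] * 4 + [0] * 2 + [1] * 7 + [0] * 2 + [1] + [0] + [1] * 6 + [0]
blocks = find_blocks(row)
bits = decode_blocks(blocks, threshold=6)
```

Because each block spans many pixels, an isolated noise pixel is discarded by the minimum-width test instead of being mistaken for data.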

The iconic image may be rendered and printed in a 
suitable place (e.g., the margin or other background region) 
in the original text image; a text image annotated with an 
iconic image of the type produced by the invention will then 
always carry the embedded data in subsequent copies made 
from the annotated original. The iconic image alone may 
also be used in printed form or as part of a graphical user 
interface as a surrogate for the original text image in a 
variety of applications. 
In particular, the iconic image may be used in a variety of
document image management applications to direct or control
an operation to be performed on or with the original text
image represented by the iconic image. The information
needed to perform the document management task is embedded
in the iconic image and a decoding operation produces
the binary data required by the document image management
operation to perform the document task. As will be
seen in the examples below, the variety of document
image tasks that may make use of the iconic image
with the embedded, encoded binary data is limited only by
imagination. For example, an operation may make use of the
iconic image as a surrogate for the original text image
because of the visual similarity between the iconic image
and the original text image; this takes advantage of perceptual
clues about the original text image in order to retrieve
and process a specific original text image in some way
specified by the operation using retrieval and processing
information embedded in the iconic image. Or the iconic
image may be used as a type of "document token", containing
embedded and encoded information related to certain
document image management operations that may be performed
on one or more documents in a class or genre of
documents that have certain features in common, such as
their content or visual appearance.

An important advantage of the present invention is the
ability to use the iconic image to bridge the digital and paper
worlds while the encoded data is robustly preserved; the
iconic image serves as an inconspicuous storage mechanism
[illegible] worlds.

Therefore, in accordance with one aspect of the present
invention, a method is provided for operating a processor-
controlled machine to perform a document image management
operation using an iconic version of a text image. The
machine includes an image signal source for receiving
image data; memory for storing data; and a processor
connected for accessing instruction data stored in the
memory for operating the machine; the processor is further
connected for receiving image data from the image signal
source; and connected for storing data in the memory. The
method comprises receiving image definition data defining
an input iconic image from the image signal source. The
input iconic image has an appearance to a human viewer of
a reduced-size version of an original text image, and
includes at least one image region including image definition
data defining a plurality of rectangular blocks each having a
foreground color and arranged in horizontal lines separated
vertically by image regions of a background color.
Arranged in this manner, the rectangular blocks have the
appearance of representing text in the original text image.
The image definition data defining each rectangular block,
referred to as an encoded data block, has a characteristic
property that represents a portion of the binary data. The
processor performs a decoding operation using the image
definition data defining the plurality of rectangular blocks to
produce the binary data encoded therein. The method then
includes performing a document image management operation
using the binary data produced by the decoding operation.

The novel features that are considered characteristic of the
present invention are particularly and specifically set forth in
the appended claims. The invention itself, however, both as
to its organization and method of operation, together with its
advantages, will best be understood from the following
description of an illustrated embodiment when read in
connection with the accompanying drawings. In the Figures,
the same numbers have been used to denote the same
component parts or steps. The description of the invention
includes certain terminology that is specifically defined for
describing the embodiment of the claimed invention illustrated
in the accompanying drawings. These defined terms
have the meanings indicated throughout this specification
and in the claims, rather than any meanings that may occur
in other sources, such as, for example, documents, if any,
that are incorporated by reference herein elsewhere in this
description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart illustrating the general operation of
the invention for performing a document image management
task using decoded binary data in an iconic image version of
a text image according to the invention;

FIG. 2 illustrates a chart showing possible configurations
of various uses of an iconic image combined with various
types of data that may be encoded in the image, for carrying
out a document image management task according to the
present invention;

FIG. 3 illustrates an original text image and its iconic
image version, which is a suitable input to the present
invention;

FIG. 4 illustrates a representative text image with the
iconic version of the [illegible] form of [illegible] data
including iconic image data as input suitable to the present
invention;

FIG. 5 illustrates a representative hardcopy text document
showing several iconic images [illegible] and capable
of [illegible] for providing a second exemplary
form of [illegible] including iconic image [illegible] as input
suitable to the present invention;

FIG. 6 illustrates the display area of a display device
showing examples of iconic versions of text images, and a
third exemplary form of iconic image data as input suitable
to the present invention;

FIG. 7 illustrates providing iconic image data and text
document image data as inputs derived from separate
sources and provided to a document image management
operation according to the present invention;

FIG. 8 illustrates an example of binary data that has been
encoded in an iconic image and that may be decoded
according to the present invention;

FIG. 9 illustrates an enlarged portion of the iconic image
showing encoded binary data in the form of rectangular data
blocks;

FIG. 10 schematically illustrates several characteristics
and properties of the encoded data blocks shown enlarged in
FIG. 9;

FIG. 11 is a flowchart illustrating the general operation of
the decoding operation of the present invention;

FIG. 12 illustrates a structuring element used in a morphological
operation to identify encoded blocks in an iconic
image, according to the decoding operation illustrated in
FIG. 11;

FIG. 13 is a block diagram illustrating a data base filing
operation that includes an encoding operation to produce an
iconic image version of a text image document;

FIG. 14 is a flowchart illustrating an embodiment of the
present invention for performing a data base retrieval operation
to locate a text image using location identification data
encoded in an iconic image version of the text image
according to the operation shown in FIG. 13, and decoded according to the present invention;

FIG. 15 is a block diagram that illustrates a word processing operation that includes an encoding operation to produce an iconic image version of a text image document;

FIG. 16 is a flowchart illustrating an embodiment of the present invention for performing a character recognition operation to produce a transcript of a text image using data encoded in an iconic image version of the text image according to the operation shown in FIG. 15, and decoded according to the present invention;

FIG. 17 is a block diagram illustrating a digital signature reduction operation and an iconic image encoding operation [...] document that contains a digital signature of the document;

FIG. 18 is a flowchart illustrating an embodiment of the present invention for performing a digital signature verification operation to authenticate an input text image using digital signature data encoded in an iconic image version of the text image according to the operation shown in FIG. 17, and decoded according to the present invention;

FIG. 19 is a flowchart illustrating a variation of the embodiment of the present invention shown in FIG. 18, where the iconic image and the original text image are received from separate sources;

FIG. 20 is a simplified block diagram illustrating a machine in which the present invention may be used; and

FIG. 21 is a block diagram schematically illustrating the software product of the present invention and its use in conjunction with a suitably configured machine.

DETAILED DESCRIPTION OF THE INVENTION

A. General Operation of the Invention

FIG. 1 is a flowchart illustrating document image management operation 200 of the present invention which uses an iconic version of a text image having binary data encoded therein. The input data needed to carry out operation 200 is received in box 210 and includes image definition data defining an input iconic image having at least one image region composed of image definition data defining images of rectangular blocks arranged in horizontal lines. Exemplary iconic image input sources 54 and 56 are shown in FIG. 1, and are described in more detail below, in the discussion accompanying FIGS. 4, 5 and 6. The binary data is decoded from the rectangular blocks, in box 300, and the decoded data is then used to perform a document image management operation, in box 400. The term "document image management operation" refers to any processor-controlled operation that uses or produces a text image, including, but not limited to, searching for, retrieving, and displaying a text image, where displaying includes presenting the image using any type of marking device; performing character recognition on a text image to produce a transcription of the text in the image; performing a copying, distribution or transmission operation using the text image; and performing an authentication operation to authenticate the text image as to its content and/or sender. Any one of these document image management operations may require additional input data, which is shown as an optional input source in FIG. 1 as data file or data structure 58 illustrated as having a dashed line outline. Similarly, the optional data needed by document image management operation 400 may also include the text document image portion of iconic image source data 54.

The input iconic image in combination with the encoded binary data may each be flexibly specified so as to be capable of being used in a wide variety of document image management operations. FIG. 2 illustrates chart 450 which shows a taxonomy of the various combinations of iconic image specification and type of information indicated by the encoded binary data. These combinations will be discussed further as the general operations of the present invention and the data they operate on or produce are described in more detail below.

2. The Input Iconic Image Having Encoded Binary Data.

a. The iconic image as a reduced-size version of an original text image.

As previously noted, an iconic image is a reduced sized version, or representation, of an original text image that substantially preserves visually significant page layout features of the full size version of the original image. The term "original text image" will be used to refer to the image definition data that defines the full-size version of the text image that is represented by a particular iconic image. An original text image includes at least one image region composed of image definition data defining images of characters, generally collectively referred to as text. As used herein, a "character" is a single, discrete, abstract element or symbol and includes not only alphabetic and numerical elements, but also punctuation marks, diacritical marks, mathematical and logical symbols used in mathematical notation such as equations, and phonetic, ideographic, or pictographic elements. For example, symbols in pictographic languages and symbols representing musical notation are included in the term character. A sequence of characters forms a "text" or "string". Image definition data defines a text image when a plurality of character images occur in the space defined by the image. Text images of the type suitable as being represented by iconic images according to the present invention are assumed to be rectangular, and to have an image coordinate system in which x increases in a horizontal direction to the right, y increases in a vertical direction downward, and x=y=0 is at the upper left corner of the image. An image location is given by a set of image coordinates, (x, y). Each location in an image may be called a "pixel." In an array defining an image in which each item of data provides a value, each value indicating the color of a location may be called a "pixel value". Each pixel value is a bit in the "binary form" of the image, a grayscale value in a "grayscale form" of the image, or a set of color space coordinates in a "color coordinate form" of the image, the binary form, grayscale form, and color coordinate form each being a two-dimensional array defining the image.

FIG. 3 shows original text image 10 that includes an image region 12 that includes textual headings, image regions 14 and 18 that include paragraphs of text, an image region 16 that includes a graphical object and image region 17 indicating another text area that can be seen to indicate a caption for the figure that is the graphical object in image region 16. It can be seen that original text image 10 has a characteristic page layout appearance that includes specific margin dimensions, specific placement of a footer with a page number, distinctive spacing between paragraphs and of lines within paragraphs, the absence of text justification at the right margin, and text that appears in various font sizes, which together contribute to producing a distinctive overall visual appearance.

FIG. 3 also shows iconic image 20, a reduced size version of original text image 10, that is input to the technique of the present invention. It can be seen that iconic image 20 has preserved the distinctive overall visual appearance of original text image 10, such that heading 12 in original image 10 is visible in a reduced size as heading 22 in iconic image 20, a reduced-size version 25 of the figure in image region 16 is positioned in proportionally the same position in iconic image 20 as it is in original image 10, and text regions 14 and 18 in original image 10 are represented in iconic image 20 as regions 24 and 28, respectively, of horizontal lines of rectangular blocks.

With reference to the taxonomy in FIG. 2, iconic image 20 may either represent a content-specific original text image 452, or it may represent a non-specific text image 458. A content-specific text image refers to a particular document image that is identifiable on the basis of its contents, such as, for example, "Bob's July XYZ project report" or "Ann's 1995 journal article on Topic ABC"; typically the iconic image of a content-specific original text image will have preserved the distinct visual appearance of the original image. In contrast, a non-specific text image refers to a genre or class of document images, such as, for example, the class of document images that are "project reports" or "project reports on XYZ project", or the class of document images that are "journal articles" or the class of document images that are "journal articles on Topic ABC". A content-specific iconic image maps to a single document, while a non-content-specific document may map to one or more documents.

With further reference to the taxonomy of FIG. 2, when iconic image 20 represents a content-specific original text image 452, iconic image 20 may be used in a document image management operation in conjunction with the original text image it represents, as indicated by entry 454 in chart 450, or it may be used in the absence of, or in place of, the content-specific original text image, as indicated by entry 456. In this latter use, an iconic image may be referred to as a "document surrogate."

FIG. 4 illustrates an example of iconic image category 454 in FIG. 2: printed full-size text image 182 in conjunction with iconic image 20, representing text image 182, printed in the lower margin of the document. Once printed, subsequent copies of the hardcopy printed document will carry iconic image 20 and its embedded, encoded data, which may be accessed via the present invention to provide information for use in performing a document image management operation.

b. Sources of iconic image input data.

FIGS. 4, 5 and 6 illustrate several alternative input data sources for providing the iconic image data to document image management operation 400 of FIG. 1. FIG. 4 illustrates the situation when iconic image 20 is printed as part of a printed, hardcopy version 182 of an original image 10 (FIG. 3) which is intended to be input to operation 400 of FIG. 1. Document 182 is provided to image data capture operation 510 which produces a file or data structure 54 of the text document image data that includes iconic image data, which is then provided as input data to operation 400. A suitable image data capture operation is a scanning operation or the like.

FIG. 5 illustrates a hardcopy document page 188 with a collection of iconic images rendered on the page. Iconic images 20 and 186 in particular are shown in detail. When these iconic images are used to represent full size versions of original text documents, this single hardcopy document can serve as a type of physical storage device for the encoded information about the particular documents represented; each iconic image shown on document page 188 is a document surrogate for the full-size text document (of one or more pages) that it represents, and is an example of category 456 in the taxonomy of FIG. 2. The operations of scanning document 188 and decoding the embedded data in a respective one of the iconic images selected according to a selection operation provide iconic image data 50 for input to operation 400 of FIG. 1. Iconic image data 50 includes the encoded data about the document the iconic image represents to a processor-controlled machine for further use. Alternatively, each iconic image rendered on document page 188 may be representative of a genre or class of documents, or of a database of documents, and a respective one of the iconic images on document page 188 may be used to provide access to the class of documents by means of the encoded data embedded therein; in this latter case, these iconic images are examples of category 458 in the taxonomy of FIG. 2.

FIG. 6 illustrates the display version 21 of an iconic image, rendered from iconic image data 56 and displayed in display area 180 of device 170. Iconic image 21 may be one of many displayed iconic images (not shown) in display area 180, and may be an iconic image of either category 458 or 456 of FIG. 2. A displayed iconic image is available for direct manipulation by a user who is able to manipulate cursor 184 to select or otherwise interact with any one of the displayed iconic images using a direct manipulation device such as a mouse. Keyboard and stylus devices are also suitable direct interaction devices. In response to a user's interaction with displayed iconic image 21 to carry out a request to perform a function, iconic image data 56 is provided to operation 200 of FIG. 1, and a decoding operation such as the one described below in the discussion accompanying FIG. 11 can extract and decode the embedded data, which may provide information with which to carry out the user's request.

The iconic image data received in box 210 of the flowchart of FIG. 1 that illustrates the general operation of the present invention may be derived from any one of the sources of document image 182 of FIG. 4, document image 188 of FIG. 5, or display area 180 in FIG. 6.

When a particular document image management operation 200 requires both text document image data and iconic image data, these may be provided from separate sources. Thus, iconic image data may be provided as input from either a version of the iconic image produced by an image capture operation, as shown in FIG. 5, or from a displayed version, as shown in FIG. 6, and the text document image data may be provided from a separate image data capture source. FIG. 7 illustrates an example of dual-source input, where the iconic image data is provided separately from the image data of the full-size original text image that the iconic image represents.

3. The Encoded Binary Data.

a. What information may be encoded.

FIG. 8 illustrates a representative sample of binary data 70 that may be encoded in an iconic image that is input to the present invention. Binary data 70 is not restricted in any way as to the nature of the information it may convey, and may, for example, represent character symbols in a language using ASCII or UNICODE character encoding, or the compressed or encrypted form of such symbols. Binary data 70 may also indicate instruction data of the type used to operate a processor that controls a machine having the configuration of machine 100 in FIG. 20. Examples of such machines include a computer, printer, scanning device or facsimile device, or a machine that combines these functions. Examples provided below of the type of information that may be represented by binary data 70 are not intended to be exhaustive or limiting in any manner.
The information represented by binary data 70 may be generally classified as shown in chart 450 of FIG. 2; that is, the encoded binary data may indicate information related to a content-specific original text image, noted by reference numeral 464, or it may indicate data related to a document management operation, denoted as reference numeral 460, or it may be a combination of those two classes of data, denoted as reference numeral 468. Data related to a document management operation may be generally characterized as operation parameter data that the operation uses to carry out its functions.

The information represented by binary data 70 may be directly related to the content of the full-size text image represented by the iconic image in which it is encoded; binary data 70 may represent, for example, all or part of the actual text included in the represented full-size text image obtained by performing a character recognition operation on the full-size text image, or obtained from some other source, such as the word processing data structure which was used to produce the full-size text image.

Alternatively or in addition to the content-related information, binary data 70 may represent certain information about the represented full-size text image, such as identifying information. Information about the document may be generally referred to as "meta-data." For example, binary data 70 may represent character encoded information indicating the URL (Uniform Resource Locator) of a location on the World Wide Web, where a representation of the full-size text image may be found. Such a representation may include: the image definition data structure of the text image itself; a file containing the formatted data for display of the text image on a computer using a browser program; a file from which the full-size text image was produced; a file containing a simple ASCII representation of the document from which the full-size text image was produced; or a file containing a simple, formatted ASCII version of the full-size text image. In another example, binary data 70 may indicate attribute information about the full-size text image, such as a time stamp indicating the version date and time of a word processing file from which the full-size text image was produced, or the date and time the iconic image was produced or printed, or any other date or time information that was made available during the process of producing the iconic image. Additional examples of attribute information include input text image owner identification information; a digital signature verifying the authenticity of the iconic image or of the data from which the iconic image was derived; and identification information about the printer on which the iconic image containing the encoded binary data was printed.

Binary data 70 may also include information generated by the encoding operation that encodes the binary data into the iconic image, or by an image reduction operation or by some other operation that provides assistance to decoding operation 300. Such information might include cyclic redundancy code (CRC) data, error correction code (ECC) data, or information about the data being encoded, such as, for example, the number of lines of encoded data, the number of bytes of encoded data, or the number of encoded data blocks that are included in the iconic image.

Binary data 70 may also indicate specific information related to a function or application with which the iconic image is associated or used. For example, the iconic image may contain color information specifying colors for regions or portions of the full size image when the full size image is printed. A color digital copier (or a binary scanner and color printer) equipped with the ability to decode the iconic image would then be able to re-render a black and white version of the full size image in color.

In a further example, the iconic image may be used as part of a system that controls authorized distribution or use of the full sized text image, or of the electronic document form of the image, represented by the iconic image. An iconic image may contain encoded data indicating the functions that the possessor or viewer of the iconic image is authorized to perform with the full sized image. In another example, the iconic image may serve as part of a document indexing or browsing function or application. An iconic image of a table of contents or of a bibliography could provide identifying reference information, such as a hypertext reference, that would provide automatic access to a specific portion of a document or to a document referenced in the contents or bibliography.

b. Overview of encoding data into rectangular blocks.

FIGS. 9 and 10 illustrate some general principles about the appearance of the encoded data in an iconic image. An encoding operation maps binary data of the type represented by binary data 70 in FIG. 8 to rectangular blocks, referred to as encoded data blocks, having certain appearance characteristics, properties and features, generally called "display features," which may vary according to the application in which the iconic image is being used. The term "display feature" refers to any human perception produced by a display device, and includes a single display feature and also may include plural display features that together form a pattern of display features in an image. Thus, interblock spacing, block height, block length and interline spacing are all perceptible display features of the encoded data blocks, and the image definition data defining each rectangular block has a characteristic property that represents a portion of the binary data. FIG. 9 shows iconic image 20 with portion 30 enlarged; portion 30 includes horizontally positioned, linear sequences of rectangular blocks. In general, binary data of the type shown by example in FIG. 8 and encoded in portion 30 of iconic image 20 is encoded into rectangular blocks having a foreground color; depending on the particular application for which the iconic image is to be used, the foreground color may be, but need not be, compatible with the foreground color of the text in the original text region that the rectangular blocks replace. FIG. 10 shows a more detailed view of portion 30. For each simulated line of text in the iconic image, a sequence of encoded data blocks containing encoded data is placed horizontally in the iconic image along a baseline such as baseline 32, and the blocks are horizontally spaced along the baseline by regions of background color, labeled in FIG. 10 as interblock spacing. In FIG. 10, the foreground color is represented as having a pixel value of "1" and the background color has a pixel value of "0", as is conventionally the case for representing black and white images. When an application's requirements demand that the encoded data blocks simulate text as closely as possible, interblock spacing should be roughly the same distance, or give the appearance of being roughly the same when the iconic image is printed or displayed, and the interline spacing should be proportional to the interline spacing in the original image. How faithfully the original text being replaced by the encoded data blocks needs to be simulated is a function of the needs of the application using the iconic image, and the blocks need not replace words and lines on a one-for-one basis. The height of each rectangular block, labeled as block height in FIG. 10, is generally proportional to other page layout dimensions in the iconic image, or to the height of the text in input text image 10; the block height may be uniform,
as shown in FIG. 10, but need not be, if a suitable encoding
operation is found that encodes data using the height dimen- 
sion of the block. 

FIG. 10 shows the encoded data blocks as having varying 
block lengths. Generally, the length of the blocks should be 
somewhat random or at least vary in a pattern that gives the 
appearance of the actual text in the original text image 
represented by the iconic image. The characteristics of the 
word lengths in the language represented in the original text 
image may influence the selection of an encoding scheme, 
and an encoding scheme that produces an aesthetically 
pleasing pattern of encoded data blocks for replacing text in 
one language may not be particularly suited to representing 
text in a different language with markedly different word 
length characteristics. 

A second stream of data may be encoded in the back- 
ground color regions that serve as interblock spacing 
between the encoded data blocks by using different length 
background color regions. In the simplest encoding, one bit 
is encoded using short and long background color regions to 
separate encoded data blocks. The encoded bit can be used 
as "parity" for error detection on the previous encoded data 
block. A set of such bits can also be used for error detection 
or correction on the message that is encoded in the encoded 
data blocks. 
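The simplest second-stream encoding described above can be sketched as follows. This is only an illustrative sketch: the gap widths `SHORT_GAP` and `LONG_GAP`, the choice of even parity, and the function names are assumptions made for the example and are not taken from the present description.

```python
SHORT_GAP = 2  # assumed width in pixels of a gap that encodes a 0 bit
LONG_GAP = 4   # assumed width in pixels of a gap that encodes a 1 bit


def parity(bits):
    """Even-parity bit over the data bits carried by one encoded data block."""
    return sum(bits) % 2


def gaps_for_blocks(block_bits):
    """Width of the background gap after each block, carrying its parity bit."""
    return [LONG_GAP if parity(bits) else SHORT_GAP for bits in block_bits]


def check_blocks(block_bits, gaps):
    """Indices of blocks whose decoded bits disagree with the gap parity bit."""
    threshold = (SHORT_GAP + LONG_GAP) / 2
    bad = []
    for i, (bits, gap) in enumerate(zip(block_bits, gaps)):
        gap_bit = 1 if gap > threshold else 0
        if parity(bits) != gap_bit:
            bad.append(i)
    return bad
```

A decoder that measures each gap against the midpoint threshold tolerates a pixel of scanning noise in either direction under these assumed widths; a single-bit error within a block is detected but not located or corrected.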

In some encoding operations, the lines of encoded data 
blocks may need to be positioned so as to begin and end 
consistently at the same respective locations on a horizontal 
line; consistent line beginning positions, at the left margin, 
are generally expected in text documents, and are straight- 
forward to implement. Consistent line ending positions for 
the encoded data blocks may be preferred when the text 
being replaced in an original text image is justified at the 
right margin. In addition, regardless of whether the text lines 
being replaced are justified at the right margin, simulating 
the last lines of paragraphs accurately may be important in 
some applications, such as, for example, when the iconic 
image is to be used as a surrogate for the original image and 
the display features of paragraph formatting are clues to the 
identity of the document represented. For example, it can be 
seen from FIG. 9 that the encoded data blocks of line 34 end 
before reaching the full length of line 34, approximately 
where the last line ends in the paragraph of text in original 
text image 10 in FIG. 3; in FIG. 10 it can be seen that the 
remainder of line 34 is filled with a run 36 of background 
color pixels. Thus, even if there is additional data to encode, 
the remainder of line 34 is left empty to simulate the 
abbreviated length of the last line of a paragraph. 

To simulate both left and right text justification, each row 
(simulating a text line) of horizontally positioned encoded 
data blocks must have the same length. This is accomplished 
by using the regions of background color, referred to as 
interblock spacing in FIG. 10, that separate each encoded 
data block from the previous and succeeding blocks to adjust 
the positioning of the encoded data blocks placed on the line. 
The general procedure is as follows: position as many
encoded data blocks in the sequence of blocks to be positioned
as will fit on the line without exceeding the maximum
line length, using a minimum allowed value of interblock 
spacing between each block; then increase the amount of 
interblock spacing between each block until the required line 
length is obtained. 
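The two-step justification procedure just described can be sketched as below. The function name `justify_rows`, the integer pixel units, and the policy of leaving the final row ragged (to imitate the short last line of a paragraph) are assumptions made for illustration; the sketch also assumes every block is no longer than the line.

```python
def justify_rows(block_lengths, line_length, min_gap):
    """Greedily fill each row, then widen the gaps so full rows span line_length.

    Returns a list of (lengths, gaps) pairs, one pair per row, where gaps[i]
    is the background run between lengths[i] and lengths[i + 1].
    """
    rows = []
    i, n = 0, len(block_lengths)
    while i < n:
        # Step 1: place as many blocks as fit using the minimum gap.
        row = [block_lengths[i]]
        used = block_lengths[i]
        i += 1
        while i < n and used + min_gap + block_lengths[i] <= line_length:
            used += min_gap + block_lengths[i]
            row.append(block_lengths[i])
            i += 1
        gaps = [min_gap] * (len(row) - 1)
        # Step 2: spread the leftover pixels over the gaps of a full row.
        # The last row (i == n) is left ragged; a one-block row has no gaps.
        if gaps and i < n:
            base, extra = divmod(line_length - used, len(gaps))
            gaps = [min_gap + base + (1 if k < extra else 0)
                    for k in range(len(gaps))]
        rows.append((row, gaps))
    return rows
```

Distributing the remainder one pixel at a time keeps the gaps within one pixel of each other, which is consistent with the stated aim that interblock spacing appear roughly uniform. If a second data stream is carried in the gap widths, this widening step would have to preserve the short/long distinction, which the sketch does not attempt.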

An alternative procedure for producing equal length rows 
of blocks is to add an encoded data block to each line that 
has the width required to make the justification. This data 
block must be at a known (or computable) position in each 
line, so that the added block is able to be distinguished from 
the actual encoded data during a decoding operation. For
example, the added block can always be the last block 
positioned in each full line of blocks. Note also that a 
combination of the techniques of adjusting the interblock
spacing and adding a block to each line may be used to
produce horizontal rows of encoded data blocks that have 
the same line length. 
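The alternative procedure, in which a filler block of the required width is always the last block of each full line, can be sketched as follows. The names `fill_rows_with_filler` and `strip_filler`, the fixed gap, and the `min_filler` reserve are hypothetical choices for this example; it assumes each data block plus the reserve fits within the line.

```python
def fill_rows_with_filler(block_lengths, line_length, gap, min_filler=1):
    """Lay out rows of blocks; every full row ends with a filler block.

    The filler is always the last block of a full row, so a decoder can
    discard it by position alone. The final row carries no filler.
    """
    rows = []
    i, n = 0, len(block_lengths)
    reserve = gap + min_filler  # room kept free for the filler and its gap
    while i < n:
        row = [block_lengths[i]]
        used = block_lengths[i]
        i += 1
        while i < n and used + gap + block_lengths[i] + reserve <= line_length:
            used += gap + block_lengths[i]
            row.append(block_lengths[i])
            i += 1
        if i < n:  # a full row: append the filler that completes the line
            row.append(line_length - used - gap)
        rows.append(row)
    return rows


def strip_filler(rows):
    """Recover the data block lengths by dropping each full row's last block."""
    if not rows:
        return []
    return [b for row in rows[:-1] for b in row[:-1]] + list(rows[-1])
```

Because the filler is identified by position rather than by width, its width may coincide with a valid data block width without ambiguity.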

In a variation of the general principles of encoding just 
described, special line-start or line-end markers may be
inserted at the beginning or end, respectively, of each
horizontal row of encoded data blocks in order to assist in 
the decoding process. Such markers may be some small 
number of pixels in width that make them clearly distinguishable
from encoded data blocks; a width of two or three
pixels is adequate to mark each row and yet does not
produce marks in the iconic image "paragraphs" that are
easily perceptible or distracting to a viewer of the iconic 
image. The markers may have the same height as the block 
height of the encoded data blocks so as to minimize the
likelihood of their being noticed in the iconic image.

Specific examples of encoding operations suitable for the 
present invention are discussed in the previously referenced 
Iconic Image Encoding application. The size of the iconic 
image and the regions available for encoding information
necessarily limit the amount of information a single iconic
image can carry, and some applications will require selection 
of a particularly efficient encoding scheme. General prin- 
ciples of information theory as applied to signal encoding 
can be used to evaluate the efficiency of a particular encoding
scheme. In particular, evaluating run length limited
(RLL) codes using known principles can aid in the selection 
of a reasonable RLL encoding scheme for a particular 
application of an iconic image in light of the type and 
quantity of information being encoded. In magnetic recording
technology, RLL codes are characterized by the parameters
[d, k] where d represents the minimum and k represents
the maximum number of 0s between two consecutive 1s in
a coded sequence of binary data. Since one of the goals of 
encoding data in the iconic image is to produce perceptible
blocks in a foreground color, simply reversing the polarity of
a selected RLL code produces length-limited runs of 1's
each separated by a single zero, which consequently produce 
blocks of foreground color pixels separated by background 
color regions in the iconic image. Information on RLL
encoding is found in numerous textbooks and articles on
information theory, magnetic recording, and other related 
signal encoding topics. See, for example, Magnetic Recording
Volume II: Computer Data Storage, C. D. Mee and E. D.
Daniel, eds., McGraw-Hill Book Company, New York,
1988, Chapter 5. See also, Norris and Bloomberg, "Channel
Capacity of Charge-Constrained Run-Length Limited
Codes," IEEE Transactions on Magnetics, Vol. MAG-17,
No. 6, November, 1981, pp. 3452-3455, (hereafter "the
Norris and Bloomberg article") which is hereby incorporated
by reference for all that it teaches. The efficiency of an
RLL code is evaluated by a measurement called the average 
channel rate, which measures the number of bits of data 
encoded per cell. 
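The relationship between a [d, k] constrained sequence and the blocks it produces after polarity reversal can be illustrated with a short sketch. The helper names and the sample sequence are assumptions chosen for the example, and d is assumed to be at least 1 so that each background run is a single cell.

```python
from itertools import groupby


def runs(bits):
    """Run-length encode a bit sequence as (bit, run_length) pairs."""
    return [(bit, len(list(group))) for bit, group in groupby(bits)]


def satisfies_dk(bits, d, k):
    """True if every run of 0s between two consecutive 1s has length in [d, k]."""
    ones = [i for i, bit in enumerate(bits) if bit == 1]
    return all(d <= (j - i - 1) <= k for i, j in zip(ones, ones[1:]))


def reverse_polarity(bits):
    """Swap 0s and 1s so that the constrained 0-runs become foreground runs."""
    return [1 - bit for bit in bits]


def block_lengths(bits):
    """Lengths of the foreground (1) runs, i.e. the encoded data block widths."""
    return [length for bit, length in runs(bits) if bit == 1]


def average_rate(data_bits, cells):
    """Data bits encoded per cell of the coded sequence."""
    return data_bits / cells
```

For the sample sequence `1 00 1 000 1 00 1`, which satisfies a [2, 7] constraint, reversing the polarity gives foreground runs of lengths 2, 3 and 2, each separated by a single background cell.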

B. Decoding an Iconic Image

Decoding the message of binary data 70 (FIG. 8) from 
iconic image 20 (FIG. 9) involves two broad operations: 
identification of the region or regions in the iconic image 
that contain the encoded data blocks, and decoding of the 
message from the blocks. These broad operations are the
same regardless of whether the iconic image to be decoded
is provided by an image data capture operation that converts 
a physical document on which the iconic image is rendered to
digital data, or is provided as an original electronic image in
the form in which it was encoded. However, particular details of
locating the encoded data blocks may be implemented differently
for iconic images that are in their original electronic form and
that have not been previously scanned or are not provided by an
image data capture operation, since image processing operations
designed to account for the noise typically introduced by an
image data capture operation would not necessarily be required.
In addition, the electronic form of the iconic image may contain
the regions of encoded data blocks as rectangles.

The following description of a decoding operation is directed to
encoded data blocks that carry the binary message in a variation
of their lengths, and not in their heights or in their vertical
positions with respect to a baseline, and includes operations
that account for noise introduced by the image capture process.
A brief discussion then follows of the different considerations
needed in decoding encoded data blocks that carry the binary
message in a variation of height or in their vertical positions
with respect to a baseline.

1. Finding Image Regions of Encoded Data Blocks.

FIG. 11 provides a general flowchart of the decoding operation
300 of FIG. 1. An input image to be decoded has regions of
encoded data blocks located within it, but in many applications
of iconic images these locations are not likely to be known in
advance. A region in iconic image 20 that is composed of
horizontal rows of elongated rectangular blocks, each of which
has approximately the same height, will be referred to hereafter
as an encoded block region.

Standard image processing operations may be used, in box 310, to
reliably locate the bounding box coordinates of the encoded
block regions in an input image. By way of example, the
following process may be used, but other image processing
operations may also be suitable. This process assumes that each
encoded data block (also referred to in this discussion as an
EDB) has approximately the same height, and that this height is
known. The beginning or ends of these lines may be composed of
specific "line-start/line-end" markers that are all of an
identical shape, distinguishable from the blocks. Location of
the encoded block regions can be done in two steps. In the first
step, image-based morphological operations can be used to locate
likely candidates for these regions. In the second step, the
candidates are evaluated to see if they conform to the expected
shapes.

Morphological operations map a source image onto a destination
image according to a rule defined by a pixel pattern called a
structuring element (SE). The SE is defined by a center location
and a number of pixel locations, each having a defined value (ON
or OFF). Other pixel positions, referred to as "don't care," are
ignored. The pixels defining the SE do not have to be adjacent
to each other. The center location need not be at the
geometrical center of the pattern; indeed it need not even be
inside the pattern of ON and OFF pixels. By way of background,
several common morphological operations operate as follows:
"Erosion" is a morphological operation wherein a given pixel in
the destination image is turned ON if and only if the result of
superimposing the SE center on the corresponding pixel location
in the source image results in a match between all ON pixels in
the SE and the underlying pixels in the source image. "Dilation"
is a morphological operation wherein a given pixel in the source
image being ON causes the SE to be written into the destination
image with the SE center at the corresponding location in the
destination image. An SE used for dilation typically has no OFF
pixels. "Opening" is a morphological operation that consists of
an erosion followed by a dilation. The result is to replicate
the ON pixels in the SE in the destination image for each match
of the ON pixels in the source image. "Closing" is a
morphological operation consisting of a dilation followed by an
erosion. For opening and closing, the result does not depend on
the center location of the SE since each operation includes
successive complementary operations with the same SE.
Information about morphological image processing is available in
a number of texts and articles. For example, image-based
approaches to document image analysis based on image shape and
texture properties are described in D. S. Bloomberg,
"Multiresolution morphological analysis of document images",
SPIE Visual Communications and Image Processing, Boston, Mass.,
November 1992, which is hereby incorporated herein by reference
as if set out in full.

The morphological operations used to locate encoded block
regions comprise two filtering operations: first, a
morphological closing with a small horizontal structuring
element is used on the input image to produce a resulting image,
referred to as R1; this operation will cause the EDBs to
horizontally merge, forming thin horizontal lines in R1. Then, a
hit-miss structuring element is used on R1 to locate these thin
horizontal lines; that is, the hit-miss structuring element
projects out of the R1 image where the thin horizontal lines are
located. The hit-miss structuring element would typically be of
a form such as element 312 shown in FIG. 12. This is a filter
that is placed, in effect, at every possible location on (or
over) the R1 image. At each location in the R1 image, the result
is either a match with SE 312 or no match. If a match is found,
an ON pixel is written at this location in a resulting R2 image;
otherwise an OFF pixel is written in the R2 image. Thus the
result of the operation is to produce a binary image, R2, with
ON pixels wherever filter 312 matches at a location in the R1
image and OFF pixels elsewhere. The conditions for a match
between filter 312 and a location on the R1 image are (1) all
pixels in the R1 image "below" (or at the location of) the 1
values in filter 312 must be ON and (2) all pixels in the R1
image below the 2s in filter 312 must be OFF. The pixels below
the 0s are not tested. This filter is well-suited to finding
horizontal lines that are about 5 pixels wide. The horizontal
extension of the filter should be long enough to eliminate
accidental matches from most elements of the image that are not
joined or merged EDBs.

The hit-miss operation is then followed by a dilation operation
performed on resulting image R2 using the ON pixels of
structuring element 312 (FIG. 12). This operation expands the
horizontal lines to be approximately the same length they were
in R1, the output of the closing operation. The dilation
operation produces resulting image R3.

Resulting image R3 will contain a set of thin horizontal lines
that potentially mark the locations of the EDBs; additionally,
there may be a few other places in R3 with ON pixels. Then, a
morphological closing operation is used with a small vertical
structuring element, large enough to join the horizontal lines;
this operation will solidify the thin horizontal lines into a
block of ON pixels, while having relatively little effect on the
other ON pixels, which will remain scattered. The resulting
image R4 produced by the morphological closing operation can
then be searched for these blocks of solid ON pixels that are
candidates for the encoded block regions containing the encoded
data blocks. A common method is to look for bounding boxes of
connected components, and to select only those bounding boxes
that are sufficiently large, thus eliminating the "noise" pixels.
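The R1 to R4 sequence described above can be sketched in a few lines of pure Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the image is a list of rows of 0/1 values, `LINE_SE` is a hypothetical stand-in for structuring element 312 (whose actual form is given in FIG. 12), the 3-pixel horizontal and vertical SEs and the size thresholds are arbitrary choices, and all function names are invented for the example.

```python
from collections import deque

H3 = [(0, -1), (0, 0), (0, 1)]   # small horizontal SE as (dy, dx) offsets (assumed size)
V3 = [(-1, 0), (0, 0), (1, 0)]   # small vertical SE (assumed size)
# Hypothetical stand-in for SE 312: 1 = must be ON, 2 = must be OFF, 0 = not tested.
LINE_SE = [[2, 2, 2], [1, 1, 1], [1, 1, 1], [2, 2, 2]]

def blank(img):
    return [[0] * len(img[0]) for _ in img]

def on(img, y, x):
    """True if (y, x) lies inside the image and is ON."""
    return 0 <= y < len(img) and 0 <= x < len(img[0]) and img[y][x] == 1

def dilate(img, se):
    out = blank(img)
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v:  # write the SE into the destination at every ON source pixel
                for dy, dx in se:
                    if 0 <= y + dy < len(img) and 0 <= x + dx < len(row):
                        out[y + dy][x + dx] = 1
    return out

def erode(img, se):
    return [[int(all(on(img, y + dy, x + dx) for dy, dx in se))
             for x in range(len(img[0]))] for y in range(len(img))]

def close(img, se):
    return erode(dilate(img, se), se)

def hit_miss(img, grid, cy, cx):
    """ON wherever every 1 in grid lies over an ON pixel and every 2 over an OFF pixel."""
    out = blank(img)
    for y in range(len(img)):
        for x in range(len(img[0])):
            out[y][x] = int(all(
                not ((g == 1 and not on(img, y + r - cy, x + c - cx)) or
                     (g == 2 and on(img, y + r - cy, x + c - cx)))
                for r, grow in enumerate(grid) for c, g in enumerate(grow)))
    return out

def bounding_boxes(img):
    """Bounding boxes (x0, y0, x1, y1) of the 4-connected components."""
    seen, boxes = blank(img), []
    for y in range(len(img)):
        for x in range(len(img[0])):
            if img[y][x] and not seen[y][x]:
                seen[y][x] = 1
                queue, x0, y0, x1, y1 = deque([(y, x)]), x, y, x, y
                while queue:
                    py, px = queue.popleft()
                    x0, y0, x1, y1 = min(x0, px), min(y0, py), max(x1, px), max(y1, py)
                    for ny, nx in ((py - 1, px), (py + 1, px), (py, px - 1), (py, px + 1)):
                        if on(img, ny, nx) and not seen[ny][nx]:
                            seen[ny][nx] = 1
                            queue.append((ny, nx))
                boxes.append((x0, y0, x1, y1))
    return boxes

def find_block_regions(img, min_w=4, min_h=4):
    r1 = close(img, H3)                      # merge EDBs into horizontal lines
    r2 = hit_miss(r1, LINE_SE, 1, 1)         # mark where such lines occur
    se_on = [(r - 1, c - 1) for r, row in enumerate(LINE_SE)
             for c, g in enumerate(row) if g == 1]
    r3 = dilate(r2, se_on)                   # restore the lines' extent
    r4 = close(r3, V3)                       # join lines into solid blocks
    return [b for b in bounding_boxes(r4)
            if b[2] - b[0] + 1 >= min_w and b[3] - b[1] + 1 >= min_h]

# Toy input: two rows of 2-pixel-high blocks plus one isolated noise pixel.
img = [[0] * 14 for _ in range(10)]
for rows, spans in (((1, 2), ((2, 4), (6, 9))), ((5, 6), ((2, 5), (7, 9)))):
    for y in rows:
        for a, b in spans:
            for x in range(a, b + 1):
                img[y][x] = 1
img[8][12] = 1
regions = find_block_regions(img)
```

On this toy input the two block rows merge into a single candidate region and the noise pixel is rejected by the hit-miss step; a real scanned image would need SE sizes chosen from the known EDB height and spacing.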
Once the candidate bounding boxes have been located, a
verification operation is needed to ensure that encoded data
blocks have been identified correctly. Using the original input
image being decoded, in each region identified in the first
step, the connected components in the original image are found
and their sizes and locations are analyzed. (The definition of a
connected component is provided below.) This analysis includes
determining whether the connected components are all about the
same height and have widths that vary between expected limits.
One way in which this may be accomplished is to take the
bounding box around each connected component and shrink it by
two pixels in each direction, to produce a smaller bounding box.
When a connected component is an encoded data block, this
reduced-size bounding box has eliminated the most common
variabilities introduced by image capture noise and should be a
solid rectangular block of foreground color pixels. In addition,
this analysis should determine whether the connected components
are organized in a two-dimensional pattern, with parallel
components laid out as if they were sitting on parallel
horizontal lines. Another useful piece of analytical information
is the variance of the horizontal and vertical run lengths in
each connected component; the less variance found in each block,
the more likely it is that a candidate region is an encoded data
region, since text regions that have been reduced and not
replaced with the regular encoded data blocks are likely to show
more variance in the vertical and horizontal run lengths of the
connected components. Various histogram techniques may be used
to develop the data needed for this analysis. It is also useful
to extract the median length (or height, if height encoding is
used) from the data developed during the variance analysis; as
will be seen below, the median value may be used to assign a
data value to each block for purposes of decoding.

The verification process may include locating the beginning or
end of line markers. As noted previously, the beginning or end
of the horizontal rows of encoded data blocks may be encoded
with specific line-start or line-end markers that are all of an
identical shape. These markers are designed to be easily
distinguishable from the encoded data blocks, and make detection
of the region in iconic image 20 that contains the encoded data
a more straightforward operation.

2. Assigning Quantized Values to the Encoded Data Blocks Using
Histogram Analyses.

Once the bounding box of each encoded block region is
determined, the operation of decoding the binary data from the
blocks within each bounding box follows next. The regions
verified to contain EDBs can be extracted from the image (i.e.,
copied to another image) for further analysis and for decoding
the information from each region individually. The size and
position of each encoded data block in an encoded block region
need to be determined in order to decode the message embedded in
the region.

Connected components are identified in each encoded block
region, in box 320 of FIG. 11, and the bounding box of each
connected component is determined. For purposes of establishing
a common terminology framework for discussing the present
invention, image locations such as pixels are "neighbors" or
"neighboring" within an image when there are no other pixels
between them and they meet an appropriate criterion for
neighboring. If the pixels are rectangular and appear in rows
and columns, each pixel may have 4 or 8 neighboring pixels,
depending on the criterion used. An "edge" occurs in an image
when two neighboring pixels have sufficiently different pixel
values according to an appropriate criterion for the occurrence
of an edge between them. The term "edge pixel" may be applied to
one or both of two neighboring pixels between which an edge
occurs. A set of pixels in an image is "connected" if each pixel
has at least one neighboring pixel that is in the set and if
each pair of pixels in the set is connected by a subset of other
pixels in the set. One or more connected sets of pixels bounded
by an edge may be called a "connected component".

Each connected component roughly represents an encoded data
block, but for decoding purposes more accurate information is
needed about the size of each block. Statistical data is
therefore collected about each block for the purpose of
assigning a block length, block height and vertical "baseline"
position to each block. The statistical data to be collected
includes the distribution of block lengths, block heights and
vertical locations of the blocks, all in units of pixels. These
distributions are typically presented in the form of histograms.
Data indicates a "distribution" of an image characteristic when
it has a value that depends on a number of measurements of the
image characteristic. For example, data can indicate a
distribution of distances or of measurements of another scalar
quantity by indicating the central value of the measured
distances; a measure of the variance of the measured distances;
or a measure combining the central value and variance. Data can
also indicate a distribution of distances by indicating
frequency of each distance or by indicating distances at which
maxima of frequency occur. A "histogram" is data that indicates
a distribution of an image characteristic by indicating
frequency of occurrence of the values of the image
characteristic. For example, if an image characteristic is
measured over a range of magnitude or size, a histogram can
indicate frequency as a function of the magnitude or size. The
range can be divided into parts and the histogram can indicate
the number of measurements occurring in each part. Thus a
histogram can be used to find maxima of frequency, for example.

The lengths of the encoded data blocks are measured, in box 324,
using the bounding box of the connected components; this type of
measurement is likely to produce a distribution of lengths that
center around certain prominent length values, called
"quantized" lengths, with a variation of only a few pixels
between the quantized lengths. For a more accurate measurement,
the length and location of each pixel row is used to determine
the median row length for the block, and a variance of the
median row length measurement is also developed. In order to
develop this more accurate measurement, the short run lengths in
each connected component must be eliminated. These short run
lengths are runs that do not extend the full length of an
encoded data block as a result of noise introduced by the
scanning operation. Three methods can be used to accomplish
this: one or more pixel rows near the top and bottom of each
connected component may be discarded; or runs of foreground
color pixels that are shorter than the longest run by more than
a small value may be eliminated; or both of these conditions may
be implemented.

This measurement process also produces a distribution of lengths
that center around quantized block lengths. The accuracy (or
reliability) of this measurement is inversely related to the
size of the variance in the measurements. The best values for
each quantized length to be assigned to encoded data blocks are
then determined from this distribution data. This is typically
done by taking the median size for those measurements determined
to be from blocks at each row length.

Quantized values of the block heights and the vertical locations
(i.e., a "baseline" position of a row of encoded data
blocks) of the rows of encoded data blocks are developed in a
manner similar to that of the block lengths, in boxes 326 and
330 of FIG. 11, since the uniform and regular placement of the
rectangular blocks during encoding suggests that these values
are expected to differ by only one or two pixels between blocks.
To measure the block heights and vertical locations most
accurately, each pixel column in a connected component is
measured, and the median and variance are then used to determine
the data value and its reliability. Pixel columns near the left
and right edges are discarded because they may not extend the
full height, again resulting from the introduction of scanner
noise. Again, the best value(s) for quantized block height(s)
are determined. For encoded data blocks that have been encoded
having the same block height, a single quantized block height
level is expected.

3. Decoding the Message from the Quantized Data Values.

These quantized values are then used to assign data values
indicating the quantized length, height, and vertical block
position of each block, in box 334. The values of data blocks
that do not contain encoded data, such as blocks that are added
to give line justification, are discarded, in box 338, from the
data to be used to extract the encoded message. The values
assigned to the encoded data blocks are then ordered, in box
340, as the encoded data blocks are ordered in the encoded block
region, that is, by line and by sequential position within each
line. These ordered values of lengths of foreground colors and
their positions provide the message bit pattern of 0's and 1's
from which the data message can be decoded; this message bit
pattern is then produced in box 350. Finally, the data message
is decoded from the extracted message bit pattern, in box 360,
using formatting information about the encoding operation.

The format of the encoding operation specifies whether there is
parity or other error correction code data, as well as whether
there is "meta-data" about the message, such as the number of
bytes in the encoded message or the number of encoded "text"
lines in the iconic image, or other information about the
encoding format or about the encoded message. Some aspects of
the formatting must be known a priori, while others can be
determined from the data. For example, the data encoded in the
height and vertical location of the blocks can be known to
specify meta-data, that is, data about the message, such as the
amount and type of ECC that has been appended. This formatting
information is used to identify and verify the bits that carry
the data message; these bits can then be decoded; in many cases,
a lookup table of selected, or of all possible, bit patterns
with corresponding decoded data bits may be used to complete the
decoding operation.

In order to decode a message encoded in the heights and vertical
positions of the encoded data blocks rather than, or in addition
to, in their lengths, the decoding steps of FIG. 11 are
essentially carried out in the same manner, with the additional
step of establishing a reference baseline and topline for each
horizontal line of connected components, in order to then
determine the amount of shift above or below these reference
points during decoding. Thus, each line of connected components
is assigned a reference baseline and topline when data values
are assigned to each connected component in box 334. Decoding
then proceeds as illustrated in FIG. 11: a message bit pattern
can be determined from the height values as ordered by encoded
data block and their respective displacements from the reference
baseline and topline in each line. The data message can then be
decoded from the extracted message bit pattern.

Reference baseline and topline positions must be specifically
encoded. Two techniques are suggested: the line-start or
line-end markers, when used, may be specifically added to lines
of encoded data blocks at the reference baseline or topline
position, or may establish both positions; or one or two
special-purpose EDBs may be added in each line that have a
fixed, reference height. For example, the first and last EDB in
each line can have fixed top and bottom raster positions that
provide the reference for top and bottom lines for all EDBs in
that line. A combination of fixed position encoded data blocks
and line-start or line-end markers may also be used to mark the
reference topline and baseline. The decoding operation can take
advantage of this information, if it is known in advance, or the
reference blocks can be detected using image processing
operations.

D. Applications of Iconic Images in Document Image Management
Operations

FIGS. 13-19 illustrate various examples of applications of
iconic images in document image management operations. These
examples are intended to be illustrative only, and not
exhaustive of all possible types of operations, and each example
is itself subject to variations that will be apparent to those
of skill in the art from the descriptions that follow.

1. Retrieving a Text Document Image from a Data Base Using
Location Identification Data Encoded in a Representative Iconic
Image.

FIG. 14 illustrates document image management operation 500 for
retrieving a displayed or printed version of an original text
image from a document image data base or other repository, using
location identification information encoded in an iconic image
representation of the full-size version of the original image.
FIG. 13 illustrates operation 515 for creating the document
image data base, and for producing and encoding the iconic image
having the proper location identification information. A
plurality of text documents, including text document 10 of FIG.
3, are provided as input to operation 515, which then loads them
into a data base or repository. Operation 515 also generates
identifying location information for each document that is
stored. Operation 515 then produces an iconic image version of
each full-size original text image and encodes the identifying
location information in the text regions of the iconic image
according to the processes described in the previously
referenced Iconic Image Encoding application. Operation 515
produces the text document image data base 518 and the iconic
image data 52 with the encoded identification location
information as outputs. Iconic image data 52 may be subsequently
printed as iconic image 23, or displayed as iconic image 27 on a
display device.

In document image management operation 500 of FIG. 14, iconic
image data 52 is provided as input to decoding operation 300,
and produces decoded document image location identification data
532, which is then input to a retrieval operation 540. Retrieval
operation 540 locates the requested document image in repository
518 using the decoded location identification data 532, and
produces the original text document image file 542, which may
then be displayed or printed as requested.

In one embodiment of operation 515 of FIG. 13, document images
are loaded into a document image repository located on a server
that is accessible via the World Wide Web (WWW). Each document
contains a unique identifying location known as its Uniform
Resource Locator, or URL. A document image may be retrieved from
document image repository 518 by sending the server a request
with the document's URL. The identifying URL data is encoded in
the iconic image version of the original document image
such that each iconic image serves as an index into the
document repository for locating a particular document.

The indexing capability of each iconic image generated 
by operation 515 may then be exploited in a number of 
different ways in this embodiment. For example, a hardcopy
printed version of all iconic image versions of documents 
stored in repository 518, much like document 188 of FIG. 5, 
may serve as a complete, secure, lightweight, and portable 
index into the data base. A user who possesses a copy of the 
iconic image index and who has access to the WWW may
simply select one of the images by marking it with a circle 
or "X" or other identifying selection mark, and scan the 
document as shown in FIG. 5 to produce iconic image data 
52 of FIG. 14. Decoding operation 300 and retrieval opera-
tion 540, stored on the server or otherwise associated with
document repository 518, accept iconic image data 52 as 
input and perform the retrieval of the desired document as 
previously described, displaying the retrieved document 
image for viewing or printing by the user. Because the URL 
data is encoded in each iconic image, there is no perceptible
identifying information about the documents on the hard- 
copy index. Moreover, the encoding is robust enough to 
permit copies of the portable index to be easily made, and to 
permit robust and reliable decoding from an image that may 
be contaminated with noise from a physically dirty docu-
ment.
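As a rough illustration of how the output of decoding operation 300 might feed retrieval operation 540 in this embodiment, the sketch below turns a decoded message bit pattern back into a URL and looks the document up. It assumes, purely for the example, that the URL was encoded as ASCII bytes with the most significant bit first and no error correction data; the function names, the example URL, and the dictionary standing in for repository 518 are all hypothetical.

```python
def bytes_to_bits(data):
    """Expand bytes into a list of 0/1 values, most significant bit first."""
    return [(b >> (7 - i)) & 1 for b in data for i in range(8)]

def bits_to_bytes(bits):
    """Pack a list of 0/1 values (length a multiple of 8) back into bytes."""
    if len(bits) % 8:
        raise ValueError("bit pattern is not a whole number of bytes")
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )

def retrieve(message_bits, repository):
    """Decode the bit pattern to a URL and return the stored document, if any."""
    url = bits_to_bytes(message_bits).decode("ascii")
    return repository.get(url)

# Stand-in for repository 518: URL -> stored document image (here just a label).
repository = {"http://example.com/docs/10": "text document 10"}

# Stand-in for the bit pattern recovered from a scanned iconic image.
bits = bytes_to_bits(b"http://example.com/docs/10")
found = retrieve(bits, repository)
missing = retrieve(bytes_to_bits(b"http://example.com/none"), repository)
```

In a deployed system the lookup would be an HTTP request to the server rather than a dictionary access, and the bit pattern would normally carry error correction data that is checked before the URL is trusted.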

2. Enhancing the Performance of Character Recognition 
Operations on a Text Document Image Using Data About the 
Image Encoded in a Representative Iconic Image. 

The iconic image of the present invention may also be
used in contexts where electronic data structure versions of 
documents are not readily available, and printed documents 
are frequently expected to be used as source documents to 
produce electronic versions of document data structures 
using a character recognition operation. Here, the iconic
image may be used to encode data for use by a character 
recognition operation, to enhance the performance of the 
recognition operation. In this context, the iconic image 
serves to make the printed document a more reliable source 
of information.

FIG. 16 illustrates character recognition operation 600 for 
performing character recognition on a text document image 
to generate a transcription of the text image; operation 600 
makes use of information encoded in an iconic image 
representation of the full-size version of the original image
in order to improve the accuracy of the recognition opera- 
tion. FIG. 15 illustrates word processing operation 610 
which is a conventional word processing operation with the 
added capability of producing and encoding an iconic image 
version of the document being produced by operation 610.
A user of operation 610 edits a text document that is 
represented by text document data structure 614 using the 
conventional editing features of word processing operation 
610. At the conclusion of document editing, operation 610 
has the available function of producing an iconic image
version of the text document image that includes data about 
text document 614 encoded in the text regions of the iconic 
image according to the processes described in the previously 
referenced Iconic Image Encoding application. The iconic 
image may, but need not, have lines in correspondence with
the text lines of the actual document. For example, each line
of the iconic image may hold data about the corresponding
line of the document. The type of information that may be
encoded includes information such as a page count of the 
document, a word and line count on each page of the
document, a word count on each line on that page, or the 
word lengths of the words on each line on a page of the 
document. There may be a different iconic image produced
for each page of the text document being produced, or there 
may be simply one iconic image for the entire document, 
depending on the amount of information that needs to be 
encoded. Operation 610 then produces as an output data 
structure a text document image file 620 which includes the 
text document image rendered from data structure 614 and 
the iconic image produced by the iconic image encoding 
feature of operation 610. The text document with the iconic 
image may be subsequently printed as document 183. Iconic 
image 20 printed on document 183 includes the embedded 
encoded data that is useful for character recognition in
regions 24 and 28. 

With reference now to FIG. 16, when printed document 
183 is scanned, as shown for example in FIG. 4, the text 
document image data with the iconic image having the 
encoded character recognition data is produced as output 
and is then input into decoding operation 300. The decoded 
data 632 is then input to a character recognition operation 
640 along with the text document image data 620 for use in 
performing recognition on the text document image, to 
produce a transcription file 642 of the text image. Character 
recognition operation 640 includes functionality that uses 
decoded data 632 to verify portions of the transcription to 
improve its accuracy. For example, if decoded data 632 
includes a word length for each word on each line in the 
document image, these word lengths can be compared to the 
length of the words produced by the character recognition 
operation in each of the lines in text document image 620; 
image areas in a particular line that may have been degraded 
by noise or that otherwise produce unrecognizable charac- 
ters are more likely to be correctly recognized when these 
word lengths for words on the line are known a priori. 
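The word-length check described above might look roughly like the following sketch. It assumes that decoded data 632 has already been unpacked into one list of word lengths per text line and that words are whitespace-delimited tokens; both are assumptions for the example, as are the function names. The sketch only flags lines whose recognized word lengths disagree with the decoded ones; how a recognizer would then re-examine those lines is left open.

```python
def word_lengths(text):
    """Per-line list of word lengths, treating whitespace-separated tokens as words."""
    return [[len(word) for word in line.split()] for line in text.splitlines()]

def suspect_lines(ocr_text, expected_lengths):
    """Indices of lines whose recognized word lengths differ from the decoded ones."""
    got = word_lengths(ocr_text)
    flagged = []
    for i in range(max(len(got), len(expected_lengths))):
        recognized = got[i] if i < len(got) else None
        expected = expected_lengths[i] if i < len(expected_lengths) else None
        if recognized != expected:
            flagged.append(i)
    return flagged

# Lengths as they would be computed at encoding time from the source document.
original = "The quick brown fox\njumps over the dog"
expected = word_lengths(original)

# A recognition result in which noise split one word on the first line.
ocr_output = "The qu ick brown fox\njumps over the dog"
flagged = suspect_lines(ocr_output, expected)
```

Here the first line is flagged because the recognizer produced five words where four were expected, while the second line passes.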

While FIG. 15 has illustrated the functionality of produc- 
ing the iconic image having the encoded recognition data as 
being included in a word processing operation, it is readily 
apparent that this functionality may exist as an independent 
function, or as part of a different document management 
function. The word processing operation is a logical, but not 
the only, place to include this functionality. 
3. Performing Document Authentication using a Represen- 
tative Iconic Image to Store a Digital Signature. 

Another important application of the present invention is 
document authentication. The authenticity of the contents or 
the sender, or both, of an electronic or even of a printed 
distribution of a document is becoming increasingly impor- 
tant as technology provides easier methods to forge or 
change documents without detection. Digital signature tech- 
nology serves to authenticate one or both of the content or 
sender of a document image, and the iconic image of the 
present invention may be used as means to encode a digital 
signature in an unobserved manner; when the digital signa- 
ture is decoded from the iconic image, it can be used to 
authenticate an accompanying, or a separately acquired, 
document image. 
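The authentication flow just outlined (a signature travels in the iconic image and is later checked against a document image) can be illustrated with a toy hash-then-sign sketch. Everything specific here is an assumption made for illustration: the key is built from two small, publicly known Mersenne primes, the scheme is unpadded textbook RSA over a SHA-256 digest, and the byte strings stand in for document image data. It is far too weak for real use and is not the signature scheme of any particular product or of the cited paper.

```python
import hashlib

# Toy key material: two known Mersenne primes. Assumed for illustration only;
# a key this small with no padding offers no real security.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)

def digest(data: bytes) -> int:
    """Message digest as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    """Transform the digest with the private key to form the signature."""
    return pow(digest(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Recompute the digest and compare it with the one recovered via the public key."""
    return pow(signature, E, N) == digest(data)

document = b"contents of the original text document image"
signature = sign(document)        # the value that would be encoded in the iconic image

genuine = verify(document, signature)
tampered = verify(document + b" (altered)", signature)
```

With these definitions `genuine` is true and `tampered` is false, which mirrors the two checks a recipient performs: that the signature was made with the matching private key, and that the content has not changed since signing.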

A "digital signature" is a transformation of a message 
using an asymmetric (public-key) cryptosystem such that a 
person having the initial message and the signer's public key 
can accurately determine: (a) whether the transformation 
was created using the private key that corresponds to the 
signer's public key; and (b) whether the message has been 
altered since the transformation was made. A digital signa- 
ture is essentially a cryptographic checksum that is com- 
puted as a function of input message data and a user's 
private key; thus, a user's digital signature varies with the 
message data being "signed". Due to the efficiency draw- 
backs of public-key cryptography, a user often signs a 
condensed version of a message, called a message digest, the principles of digital signatures to images by altering
rather than the message itself. A digital signature is created. some quantity of image signals in an original image to 
by running message text through a hashing algorithm to yield a
message digest. The message digest is then encrypted using the
private key of the individual who is sending the message, turning it
into a digital signature. Message digests are generated by applying a
hash function, which is a one-way function, to the text of the
electronic message to be sent. A hash function is a keyless
transformation function that, given a variable-sized message as
input, produces a fixed-sized representation of the message as output
(i.e., the message digest). Because the hash function is considered
to be secure, signing a message digest provides the same security
services as signing the message itself. Because the message digest is
derived from the message, each message produces a unique message
digest.

Digital signature software encrypts the message digest and the hash
function using an asymmetric cryptosystem and the sender's private
key. The encrypted digest and one-way function are attached to and
sent along with the message to the intended recipient. The sender's
public key may also be attached to the message, if it is not
otherwise available to a recipient. Software on the recipient's
computer system then uses the sender's public key to decrypt the
message digest and one-way function, and applies the one-way function
to the message to recalculate the message digest. If the recalculated
message digest matches the original digest, then the recipient has
accurate and reliable verification that (a) the message was not
altered in transmission or tampered with, and (b) the originator
(i.e., the owner of the private key paired with the public key) sent
the message. Verification that the message has not been tampered with
is reliable because if the message is changed after the original
digest is created, the new digest will not match the original one
that the sender created and attached to the message. Verification
that the originator sent the message is reliable because public key
transformation functions are 1-way (i.e., not forgeable); only the
sender's public key can decrypt a message that was encrypted with the
sender's private key. Thus, although a digital signature is not a
handwritten signature, the processes of creating a digital signature
and verifying it provide electronically at least the same effects, if
not more, that a handwritten signature on paper provides. That is, a
digital signature functions to ensure that a message is authentic,
its integrity has not been compromised, and the sender cannot disavow
or repudiate the message after sending it.

The present invention may serve as an aid in reliably providing the
sender's public key by providing the capability of encoding the
public key unobtrusively in the iconic image, along with the digital
signature. The public key and digital signature could then travel
with the document in any form the document took, and would always be
available for authentication. Even if the public key is not encoded,
and is provided from a trusted repository, the digital signature,
when encoded in an iconic image version of the document which is
printed on the document, would always travel with the document image.

Digital signatures are typically discussed in terms of authenticating
messages in the form of conventional text, and in this context, the
original message text must be available for authentication purposes.
In many document image management tasks, however, the original
message text in an electronic code form such as ASCII is not
available. However, technology has been developed to apply the
principles of digital signatures to digital images. The Digimarc™
technology discussed above appears to apply encode data about the
image. See also O'Gorman and Rabinovich, "A Pattern Recognition
Approach to Photo-ID Authentication", International Conference on
Pattern Recognition (ICPR '96), Vienna, Austria, August 1996
(hereafter, the "O'Gorman and Rabinovich paper"), which discloses a
process for producing a "photo-signature" that is a concise
representation of a photographic image on a document; the
photo-signature is determined by pattern recognition techniques
applied to the image data; the photo-signature is either stored in a
data base for later authentication purposes, or it is stored on the
document itself in an encrypted form; this latter form of
authentication is referred to as self-authentication. The O'Gorman
and Rabinovich paper is incorporated by reference herein for all that
it teaches.

FIGS. 17, 18 and 19 illustrate two embodiments of the present
invention for self-authenticating a text document image using a
digital signature of the type proposed in the O'Gorman and Rabinovich
paper. With reference to FIG. 17, digital signature operation 710
provides the ability to produce a digital signature for the text
document image represented by data structure 55 of FIG. 7 which is
the document image of document 10. Using the terminology in the
O'Gorman and Rabinovich paper, a photo-signature of the image content
of document image file 55 is first produced using one of the pattern
recognition methods described. These methods generally involve
dividing the image into a grid of squares and deriving a number for
each square; the number derived depends on the average darkness of
the square and that of its neighboring squares. These numbers are
collectively the photo-signature and are then used to represent the
content of the image. As in the text domain, the same technique is
used on the image presented for authentication to produce a target
photo-signature which would be compared to the original
photo-signature. As long as the image may be consistently divided
into essentially the same squares, the same photo-signature will be
produced.

While the photo-signature technique is described in the O'Gorman and
Rabinovich paper in the context of a photographic (continuous tone or
halftone) image, the technique may be adapted to adequately represent
the content of a text image as well, and the term "photo-signature"
as used in the context of the description of the example illustrated
in FIGS. 17, 18 and 19 is intended to include a digital signature for
an image that contains text. For example, for a binary image, the
number derived for a square may be a "gray" value derived from the
fraction of black pixels in an NxN square. Or, the number of edge
pixels at various orientations may be used to produce a number for a
grid square. Once this type of measure of the content of a square is
determined, the elements of the digital signature may be derived by
one of the methods proposed in the O'Gorman and Rabinovich paper
(e.g., comparing with neighboring values), or by a modification
thereof.

The methods proposed by O'Gorman and Rabinovich require registration
marks on the document in order to generate rectangular (or square)
regions of the image in a completely repeatable and reliable fashion
because some scaling, skew and arbitrary translation may occur when
the image is scanned. Three registration points are the minimum
required if the transformation is affine (a linear transform with
translation, rotation and uniform scaling).

In the self-authentication option proposed in the O'Gorman and
Rabinovich paper, the photo-signature is then encrypted using an
encryption operation and using the private key of the sender or
document originator. After the encrypted photo-signature is produced,
iconic image encoding operation 718 produces an iconic image version
of the text document image with the encrypted photo-signature, the
function used to produce the signature, if necessary, and,
optionally, the sender's public key encoded in the text image regions
of the iconic image according to the processes described in the
previously referenced Iconic Image Encoding application. Then,
assuming the document image with its digital signature is going to be
transmitted or otherwise used in its image form, a rendering
operation (not shown) produces a file 720 of the text document image
data representing document 10 with the iconic image data having the
encoded photo-signature, the signature function, and the optional
public key of the sender. When file 720 is subsequently transmitted,
received and printed or displayed, document 187 includes iconic image
20 with the digital signature data encoded in regions 24 and 28.
Operations 710 and 718 may be implemented as separate software
programs or as one combined operation.
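As a hedged illustration of the grid-based measure described for a binary text image, the following Python sketch computes a "gray" value per NxN square as the fraction of black pixels, then derives one signature bit per square by comparing it with its neighbors. The function names, the 4-connected neighborhood, and the "darker than the neighbor mean" rule are assumptions made for illustration only; they are not taken from the O'Gorman and Rabinovich paper, and the encryption and iconic image encoding steps are omitted.

```python
def gray_grid(image, n):
    """Return the fraction of black pixels (1 = black) in each n x n square."""
    rows, cols = len(image) // n, len(image[0]) // n
    return [
        [
            sum(image[y][x]
                for y in range(r * n, (r + 1) * n)
                for x in range(c * n, (c + 1) * n)) / (n * n)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def photo_signature(image, n):
    """One bit per square: 1 if the square is darker than the mean of its
    4-connected neighbors (an assumed comparison rule), else 0."""
    grid = gray_grid(image, n)
    rows, cols = len(grid), len(grid[0])
    bits = []
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                grid[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            mean = sum(neighbors) / len(neighbors) if neighbors else 0.0
            bits.append(1 if grid[r][c] > mean else 0)
    return bits


# A 4x4 binary image whose top-left 2x2 square is entirely black.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
signature = photo_signature(image, 2)
```

In this sketch the bit list would stand in for the photo-signature that is subsequently encrypted and encoded in the iconic image; a practical measure would likely need to be more tolerant of scanning noise.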

Digital signature verification operation 700 illustrated in
FIG. 18 follows much the same pattern as the previously
described embodiments of document image management
operations. Document 187 of FIG. 17 is scanned in the
manner described in FIG. 4, and the document image data
722 is provided as input to iconic image decoding operation
300, which decodes the binary data encoded in the iconic
image region of document 187 to extract the photo-signature,
and the signature function and the public key, if
these are provided. This decoded data is then input to digital
signature verification operation 740, along with the text
document image data portion of image data 722. Verification
operation 740 is related to operation 710: operation 740
produces the photo-signature for the text document image
data that was actually received, referred to as the first digital
signature, in the same manner as operation 710, using the
decoded signature function; operation 740 then decrypts the
decoded photo-signature using the public key to obtain what
is referred to as the second digital signature. The first and
second digital signatures are then compared, and if they are
identical, or close within some predetermined threshold, the
document is determined to be authentic. FIG. 19 illustrates
operation 702 which is a variation of operation 700 that allows
for the receipt of document text and iconic images from separate
sources.
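The comparison step of the verification operation can be sketched as follows, assuming for illustration that both digital signatures are equal-length lists of bits and that "close within some predetermined threshold" means a bounded fraction of mismatched bits. The function name and the default threshold are hypothetical; the decryption of the decoded photo-signature with the public key is assumed to have happened before this step.

```python
def signatures_match(first, second, max_mismatch_fraction=0.05):
    """Return True if two bit-list signatures are identical or differ in at
    most the given fraction of positions (an assumed closeness measure)."""
    if not first or len(first) != len(second):
        return False
    mismatches = sum(a != b for a, b in zip(first, second))
    return mismatches <= max_mismatch_fraction * len(first)


# first: recomputed from the received image; second: decrypted from the icon.
second = [1, 0, 0, 0] * 10
first_exact = list(second)
first_noisy = list(second)
first_noisy[3] = 1                      # one bit changed by scanning noise
first_altered = [1 - b for b in second]  # every bit differs

results = (
    signatures_match(first_exact, second),
    signatures_match(first_noisy, second),
    signatures_match(first_altered, second),
)
```

A tolerance of this kind is what allows a scanned paper copy to authenticate even though scanning rarely reproduces the image exactly; the appropriate threshold would depend on the signature measure and the scanner.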

In a variation of the digital signature application just
described, an iconic image may be used to carry identifying
information about one or more attributes of the text document
image (referred to as "meta-data") that is not derived
from or dependent on the content of the image and is thus not
a digital signature per se. This information may be encrypted
using the private key of the sender or document originator,
who presumably is the only source of the private key. This
encrypted information may then be encoded in the iconic
image. In addition, the iconic image may be produced using
a very faithful reproduction and encoding technique such
that the iconic image clearly and distinctly shows the
appearance of the document image it represents. The iconic
image in this embodiment serves as what might be called a
"weak" authentication of the original text image: if the
iconic image visually appears substantially similar to the
original document and the decoded data from the iconic
image is successfully able to be decrypted using the public
key, it can be reliably determined that the sending authority
in possession of the private key sent the document image
carrying the iconic image, even if the content of the image
cannot be authenticated. Verification of the sending authority
alone will be sufficient in some applications to authenticate
the document and thus the operation of producing the
photo-signature may be eliminated.

4. Iconic Images Used as Document Tokens 

As noted earlier, an iconic image need not be the reduced 
size version of a specific text image, but rather may be a 
reduced version of a text document image that is represen- 
tative of a class, or genre, of text document images. Cat- 
egory 458 in FIG. 2 refers to this use for an iconic image; 
when an iconic image is used in this manner, it may be 
generally characterized as a "document token." An example 
of such a use may be described with respect to modified 
versions of FIGS. 13 and 14. Text document images such as 
image 10 may be cataloged and stored in a document 
repository according to the document class they belong to. 
So, for example, document repository 518 may hold aca- 
demic journal articles by various authors, business letters, 
project correspondence, and other conventional types of 
documents used by a typical business. An iconic image may 
be generated by operation 515 for each class of document in 
repository 518, together with title and location identifying 
information for each document in the class, or for a subset 
of documents in the class, when input data is provided to
operation 515 as to how to select the subset. Decoding of the
iconic image in operation 500 of FIG. 14 would then permit,
or restrict, access only to those documents encoded in the
iconic image. Possession of an iconic image, then, would
provide the holder with certain document privileges with
respect to the documents in the repository.
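The document token idea can be illustrated with a minimal Python sketch in which the data decoded from the iconic image lists the titles and locations of the documents in a class, and retrieval is refused for anything not listed. The `DocumentToken` type, the `retrieve` function, and the dictionary standing in for the repository are all hypothetical names introduced here; they do not correspond to operations 500 or 515.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentToken:
    """Data assumed to be decoded from a document-token iconic image."""
    document_class: str
    locations: dict = field(default_factory=dict)  # title -> repository location


def retrieve(token, repository, title):
    """Return a document only if the token encodes its location."""
    location = token.locations.get(title)
    if location is None:
        raise PermissionError(f"{title!r} is not covered by this token")
    return repository[location]


# A toy repository keyed by location.
repository = {
    "/journals/a1": "text of article A",
    "/letters/l7": "text of letter L",
}
token = DocumentToken("academic journal articles", {"Article A": "/journals/a1"})
article = retrieve(token, repository, "Article A")
```

Requesting "Letter L" with this token raises `PermissionError`, which mirrors the notion that possession of the token grants privileges only for the documents it encodes.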

5. Other Document Image Management Applications Using 
Iconic Images. 

A variety of different types of information may be 
encoded in the iconic image related to the use and distribu- 
tion of either the full-size original image or the iconic image. 
For example, the iconic image may store encoded informa- 
tion related to when the iconic image was produced, such as 
a time stamp, or identifying information about the owner of 
the full-size image or the owner of the machine that gener- 
ated the iconic image. 

The possessor of the iconic image may have certain usage 
rights encoded therein pertaining to the use of the original 
image, such as whether it can be distributed to others, 
whether it can be printed, and how many copies may be 
made. In a commercial usage rights system, an iconic image 
might be purchased with certain rights encoded in it to 
permit control over access to certain document repositories. 
For example, a full-size image may show an abstract or 
review of a longer document, article, or book. The iconic 
image that represents the full-size image contains the usage
rights the possessor of the iconic image has with respect to
access to the complete document.

An iconic image may serve as a convenient and unobtrusive
storage mechanism for cross-references to the locations
of other information. For example, an iconic image version 
of a bibliography of references could store, for each refer- 
ence in the bibliography, a data base file name or other 
identifying location information of the location of the full 
text of the reference. When the iconic image is scanned and 
decoded, these references are available to a user for retrieval 
and viewing. In combination with the encoding of usage 
rights information in the iconic image for each reference, the 
decoded information may also further specify whether a 
particular reference is able to be printed or transmitted to 
another location. 
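A hedged sketch of how decoded cross-reference data might be represented follows. The payload layout (one `label|location|print|transmit` record per line) is purely an assumption for illustration; the source does not specify any encoding format for references or usage rights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceEntry:
    label: str          # e.g. the bibliography entry number
    location: str       # data base file name or other location information
    may_print: bool     # usage right: reference may be printed
    may_transmit: bool  # usage right: reference may be sent elsewhere


def decode_entries(payload):
    """Parse the hypothetical text payload recovered from the iconic image."""
    entries = []
    for line in payload.splitlines():
        if not line.strip():
            continue
        label, location, can_print, can_transmit = line.split("|")
        entries.append(
            ReferenceEntry(label, location, can_print == "1", can_transmit == "1")
        )
    return entries


def printable_locations(entries):
    """Locations of the references the holder is allowed to print."""
    return [e.location for e in entries if e.may_print]


payload = "[1]|db/ref1.txt|1|0\n[2]|db/ref2.txt|0|1\n"
entries = decode_entries(payload)
printable = printable_locations(entries)
```

Keeping the rights alongside each location lets a retrieval application decide per reference whether printing or transmission is permitted.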

E. The Machine Environment of the Invention 

FIG. 20 is a block diagram of a generalized, processor- 
controlled machine 100; the present invention may be used
in any machine having the common components,
characteristics, and configuration of machine 100, and is not 
inherently related to any particular processor, machine, 
system or other apparatus. The machine or system may be 
specially constructed and optimized for the purpose of 
carrying out the invention, or it may comprise a general 
purpose computer selectively activated or reconfigured by a 
computer program stored in the computer, or it may be a 
combination of a general purpose computer and auxiliary 
special purpose hardware. When a machine such as machine 
100 is suitably programmed to embody the present 
invention, the machine is not a standard or known configu- 
ration. 

An image signal source 158 provides the input image data 
required by document image management operation 200. 
Image signal source 158 may be any image data capture 
device, such as a scanning device, a digital camera, or an 
interface device that produces a digital image definition data 
structure from another type of image signal. An input iconic
image or an input text image with or without its iconic image 
provided by image signal source 158 is forwarded via image 
input circuitry 156 to processor 140 and may be stored in 
data memory 114. 

Machine 100 may also include input circuitry (not shown) 
for receiving signals from a signal source (also not shown).
A particular document image management operation may 
require non-image data, shown as other data 58 in FIG. 1. 
Such sources include signals from another processor per- 
forming an operation, or signals from a memory device. This 
signal source may also include user interaction devices 
controllable by a human user that produce signals in 
response to actions by the user, such as a pointing device or 
a keyboard. Another type of user interaction device is a 
stylus device that the user moves over a special data col- 
lecting surface, which might be the display area of a display 
device (not shown). These input signals are also forwarded 
via input circuitry to processor 140 and may be stored in data 
memory 114. For some document image management opera- 
tions performed according to the present invention, machine 
100 includes a conventional display device 170 capable of 
presenting images, such as a cathode ray tube, a liquid 
crystal display (LCD) device, a printing device, or any other 
device suitable for presenting images. Display device 170 
and its associated output circuitry 160 are shown as having 
dashed line outlines to indicate that these components may 
not be necessary in all implementations of the present 
invention. 

Processor 140 operates by accessing program memory 
110 to retrieve instructions, which it then executes. Program 
memory 110 includes document image management instruc- 
tions 200 that implement the functions shown in flowchart 
200 of FIG. 1. Program memory 110 includes instructions 
for the subroutines needed to accomplish the document 
image management task according to instructions 200. Dur- 
ing execution of the instructions, processor 140 may access 
data memory 114 to obtain or store data necessary for 
performing its operations. Data memory 114 stores the 
image definition data structure 55 defining the original input
image as well as the image definition data structure 56 
defining the iconic image version. Data memory 114 also 
stores the decoded binary data 116 that is decoded from 
iconic image data 56 by decoding subroutine 300 of FIG. 11. 
Data memory 114 also stores various other miscellaneous 
data, including any other data 58 (FIG. 1) that is used by 
document image management operation 400 in addition to 
decoded data 116. 

The actual manner in which the physical components of
machine 100 are connected may vary, and may include
hardwired physical connections between some or all of the
components, as well as connections over wired or wireless
communications facilities, such as through remote or local
communications networks and infrared and radio connections.
Program memory 110 or data memory 114, for
example, may include memory that is physically connected
to processor 140 as local memory, or that is remotely
accessible to processor 140 by means of a wired or wireless
communications facility (not shown).

F. The Software Product of the Invention

FIG. 21 shows software product 120, an article of manufacture
that can be used in a machine that includes components
like those shown included in machine 100. Software
product 120 includes data storage medium 130 that can be
accessed by storage medium access circuitry 150. Data
storage medium 130 stores instructions for executing the
method of the present invention for performing a document
image management operation using an iconic image version
of an original input text image, as illustrated in FIG. 1, and
may include instructions for performing the method according
to one of the illustrated embodiments of the invention
illustrated in the flowcharts of FIGS. 14, 15, 16, 18 and 19.
Software product 120 may be commercially available to
a consumer in the form of a shrink-wrap package that
includes data storage medium 130 and appropriate documentation
describing the product. In that case, a data storage
medium is a physical medium that stores instruction data.
Examples of data storage media include magnetic media
such as floppy disks, diskettes and PC cards (also known as
PCMCIA memory cards), optical media such as CD-ROMs,
and semiconductor media such as semiconductor ROMs and
RAMs. As used herein, "storage medium" covers one or
more distinct units of a medium that together store a body of
data. For example, a set of disks storing a single body of data
would be a storage medium. "Storage medium access circuitry"
is circuitry that can access data on a data storage
medium. Storage medium access circuitry 150 may be
contained in a distinct physical device into which data
storage medium 130 is inserted in order for the storage
medium access circuitry to access the data stored thereon.
Examples of storage medium access devices include disk
drives and CD-ROM readers. These may be physically
separate devices from machine 100, or enclosed as part of a
housing of machine 100 that includes other components.
Storage medium access circuitry 150 may also be incorporated as part
of the functionality of machine 100, such as when storage medium
access circuitry includes communications access software and
circuitry in order to access the instruction data on data storage
medium 130 when data storage medium 130 is stored as part of a
remotely-located storage device, such as a server in a networked
client-server environment. Software product 120 may be commercially
or otherwise available to a user in the form of a data stream
indicating instruction data for performing the method of the present
invention that is transmitted to the user over a communications
facility from the remotely-located storage device. In the latter
case, article 120 is embodied in physical form as signals stored on
the remotely-located storage device; the user purchases or accesses a
copy of the contents of data storage medium 130 containing
instructions for performing the present invention, but typically does
not purchase or acquire any rights in the actual remotely-located
storage device. When software product 120 is provided in the form of
a data stream transmitted to the user over a communications facility
from the remotely-located storage device, instruction data stored on
data storage medium 130 is accessible using storage medium access
circuitry 150. Alternatively, a data stream transmitted to the user
over a communications facility from the remotely-located storage
device may be stored in some suitable local memory device of machine
100, which might be program memory 110, or a data storage medium
locally accessible to machine 100 (not shown), which would then also
be accessible using storage medium access circuitry 150.

Data storage medium 130 stores instruction data which is provided to
processor 140 for execution when the method for producing an iconic
image version is to be used. The stored data includes document image
management operation instructions 122; when these instructions are
provided to processor 140, and processor 140 executes them, the
machine is operated to perform the operations for performing a
document image management operation, as represented in box 400 of
FIG. 1, or in boxes 540, 640, and 740 in the flowcharts of the
illustrated embodiments in FIGS. 14, 15, 16, 18 and 19, respectively.

The stored data further include iconic image decoding instructions
124; when these instructions are provided to processor 140, and
processor 140 executes them, the machine is operated to perform the
operations for decoding an input iconic image, as represented in box
300 of FIG. 1, and in the flowchart of FIG. 11.

Although not shown in FIG. 21, the stored data stored on data storage
medium 130 may further include data indicating encoding instructions
for encoding an iconic image version of an original image according
to the processes described in the previously referenced Iconic Image
Encoding application; when these instructions are provided to
processor 140, and processor 140 executes them, the machine is
operated to perform an encoding operation, as represented in various
flowcharts in the Iconic Image Encoding application.

The present invention performs a document image management operation
using an iconic, or size-reduced, version of an original text image
that has embedded in it encoded binary data indicating data for use
by the operation. The encoded data is placed in the iconic image in
the form of horizontal lines of rectangular blocks which appear to a
viewer to be representative of the text portion of the original image
that they replace. The iconic image is suitable for use in a wide
variety of document processing applications, and several embodiments
have been illustrated herein as exemplary of the types of uses that
may be made of the present invention. These uses are characterized by
the ability of the iconic image to function as a document surrogate
for the original full-size version of the image, or as a document
token for a class of images, to enable unobtrusive, robust portable
storage of data related to the image. While the invention has been
described in conjunction with several embodiments, it is intended to
embrace all modifications and variations that are apparent to those
skilled in the art and that fall within the scope of the appended
claims.

What is claimed is:

1. A method for operating a processor-controlled machine to perform a
document image management operation using an iconic version of a text
image, referred to as an iconic image, having encoded binary data
embedded therein; the machine including an image signal source for
receiving image data; memory for storing data; and a processor
connected for accessing instruction data stored in the memory for
operating the machine; the processor being further connected for
receiving image data from the image signal source; and connected for
storing data in the memory; the method comprising:

receiving image definition data defining an input iconic image from
the image signal source; the input iconic image having an appearance
to a human viewer of a reduced-size version of an original text
image; the input iconic image including at least one image region
including image definition data defining a plurality of rectangular
blocks each having a foreground color and arranged in horizontal
lines separated vertically by image regions of a background color;
the rectangular blocks having the appearance of representing text in
the original text image; the image definition data defining each
rectangular block, referred to as an encoded data block, having a
characteristic property that represents a portion of the binary data;

performing a decoding operation using the image definition data
defining the plurality of rectangular blocks to produce the binary
data encoded therein; and

performing a document image management operation using the binary
data produced by the decoding operation.

2. The method of claim 1 for operating a processor-controlled machine
to perform a document image management operation wherein the original
text image represented by the iconic image is a content-specific text
image; and wherein the binary data embedded in the rectangular blocks
and produced by the decoding operation indicates data related to the
content-specific text image represented by the iconic image; the
document image management operation performing an operation using the
data related to the content-specific text image.

3. The method of claim 2 wherein the content-specific text image is
stored in a memory device; wherein the document image management
operation is a retrieval operation; and wherein the data related to
the content-specific text image is memory location data indicating a
data path to a memory location of the content-specific text image;
the document image management operation using the memory location
data to access and obtain the content-specific text image from the
memory location thereof.

4. The method of claim 2 wherein the content-specific text image is
received as an input text image; wherein the document image
management operation is a digital signature verification operation;
and wherein the data related to the content-specific text image is
digital signature data indicating a digital signature for the
content-specific text image; the document image management operation
using the input text image and the digital signature data to
authenticate the content-specific text image.

5. The method of claim 2 wherein the content-specific text image is
received as an input text image; wherein the document image
management operation is a document authentication operation for
authenticating an authorized sender of the content-specific text
image; and wherein the data related to the content-specific text
image is encrypted data encrypted using a private key in possession
of the authorized sender; the document image management operation
using the input text image, a public key paired with the private key
and the encrypted data to decrypt the encrypted data in order to
authenticate the authorized sender of the content-specific text
image.

6. The method of claim 2 wherein image definition data defining the
content-specific text image is provided by the signal source with the
iconic image and is stored by the processor in the memory device; and
wherein the data related to the content-specific text image is
content data related to the information content of the
content-specific text image; the document image management operation
using the content data to perform the operation using the
content-specific text image.

7. The method of claim 6 wherein the document image management
operation is a character recognition operation; and wherein the
content data related to the information content of the
content-specific text image indicates data for use in performing the
character recognition operation on the content-specific text image.

8. The method of claim 2 wherein the iconic image is a
document surrogate for the content-specific text image; the
iconic image being used independently from the original
content-specific text image to represent the content-specific
text image; the data related to the content-specific text image
including identification data indicating identifying information
about the content-specific text image.

9. The method of claim 1 for operating a processor-controlled
machine to perform a document image management
operation wherein the original text image represented
by the iconic image is a content-specific text image stored in
a memory device; and wherein the binary data embedded in
the rectangular blocks and produced by the decoding operation
indicates operation parameter data related to the document
image management operation; the document image
management operation performing an operation using the
content-specific text image stored in the memory device; the
document image management operation using the operation
parameter data in performing the operation.

10. The method of claim 9 wherein the document image
management operation is a distribution operation controlling
distribution of the content-specific text image; and wherein
the operation parameter data indicates an authorized destination
of the content-specific text image.

11. The method of claim 1 for operating a processor-controlled
machine to perform a document image management
operation wherein the iconic image represents a non-specific
text image indicating a class of document images;
and wherein the binary data embedded in the rectangular
blocks and produced by the decoding operation indicates
operation parameter data related to the document image
management operation; the document image management
operation performing an operation using a respective one of
the document images in the class of document images; the
document management operation using the operation
parameter data in performing the operation.

12. An article of manufacture for use in a machine that
includes a memory device for storing data; a storage 
medium access device for accessing a medium that stores 
data; and a processor connected for accessing the data stored 
in the memory device and for receiving data from the storage 
medium access device; the article comprising: 
a data storage medium that can be accessed by the storage 
medium access device when the article is used in the 
machine; and

data stored in the data storage medium so that the storage 
medium access device can provide the stored data to 
the processor when the article is used in the machine; 
the stored data comprising instruction data indicating 
instructions the processor can execute; 

the processor, in executing the instructions, receiving 
image definition data defining an input iconic image 
from the image signal source; the input iconic image 
having an appearance to a human viewer of a reduced- 
size version of an original text image; the input iconic 
image including at least one image region including 
image definition data defining a plurality of rectangular 
blocks each having a foreground color and arranged in 
horizontal lines separated vertically by image regions 
of a background color such that the rectangular blocks 
have the appearance of representing text in the original 
text image; the image definition data defining each 
rectangular block, referred to as an encoded data block, 
having a characteristic property that represents a por- 
tion of the binary data; 

the processor, further in executing the instructions, per- 
forming a decoding operation using the image defini- 
tion data defining the plurality of rectangular blocks to 
produce the binary data encoded therein; 

the processor, still further in executing the instructions, 
performing a document image management operation 
using the binary data produced by the decoding 
operation. 

*****